library(knitr)
source("../R/SFA.ExtractTopFeatures.R")
We annotate the top genes identified by the SFA analysis of the GTEx data.
# Read the SFA outputs (lambda and the transposed F matrix)
lambda_out <- read.table("../sfa_outputs/GTEX2013/counts_sqrt_gtex/counts_sqrt_gtex_lambda.out")
f_out <- t(read.table("../sfa_outputs/GTEX2013/counts_sqrt_gtex/counts_sqrt_gtex_F.out"))
# Read the gene IDs and keep the first 15 characters (ENSG plus 11 digits)
gene_names <- as.vector(as.matrix(read.table("../sfa_inputs/gene_names_GTEX_V6.txt")))
gene_names <- substring(gene_names, 1, 15)
xli <- gene_names
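The fixed-width `substring()` call works because every Ensembl gene ID stem is 15 characters long (`ENSG` plus 11 digits). Below is a minimal sketch of a pattern-based alternative, assuming the raw IDs carry a `.version` suffix. The example IDs and their version numbers are invented for illustration and are not taken from the input file.

```r
# Hypothetical versioned Ensembl IDs (version suffixes invented for illustration)
ids <- c("ENSG00000171401.10", "ENSG00000170477.8", "ENSG00000163209")

# Drop a trailing ".<digits>" version suffix if present; unversioned IDs pass through
stripped <- sub("\\.[0-9]+$", "", ids)

# For well-formed human gene IDs this agrees with the fixed-width approach
identical(stripped, substring(ids, 1, 15))
```

The regular expression does not depend on the ID length, so it may be the safer choice if the input ever mixes ID types.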
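# Indices of the top 100 features; each row of indices_mat indexes into gene_names
indices_mat <- SFA.ExtractTopFeatures(f_out, top_features = 100, options="min", mult.annotate = TRUE)
# Map each row of indices to the corresponding gene IDs
gene_list <- do.call(rbind, lapply(seq_len(nrow(indices_mat)), function(x) gene_names[indices_mat[x,]]))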
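The row-by-row lookup that builds `gene_list` can also be written as a single vectorised index. The sketch below uses toy stand-ins (`gene_names_toy` and `indices_toy`, both invented here) and assumes, as the code above implies, that each row of the index matrix holds positions into the gene-name vector.

```r
# Toy stand-ins for the real objects (values are invented for illustration)
gene_names_toy <- c("g1", "g2", "g3", "g4", "g5")
indices_toy <- matrix(c(5, 1, 3,
                        2, 4, 1), nrow = 2, byrow = TRUE)  # 2 rows x 3 top genes

# Row-by-row lookup, mirroring the do.call(rbind, lapply(...)) pattern
by_row <- do.call(rbind, lapply(seq_len(nrow(indices_toy)),
                                function(k) gene_names_toy[indices_toy[k, ]]))

# Vectorised lookup: index with the whole matrix, then restore its shape
vectorised <- matrix(gene_names_toy[indices_toy], nrow = nrow(indices_toy))

# Both approaches give the same character matrix
identical(by_row, vectorised)
```

Indexing a plain vector with a matrix returns the values in column-major order, which is also how `matrix()` fills its result, so the original layout is preserved.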
# Query MyGene.info for the genes in the first row of gene_list
out <- mygene::queryMany(gene_list[1,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | symbol | summary | X_id | name |
|---|---|---|---|---|
| ENSG00000171401 | KRT13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | keratin 13 |
| ENSG00000170477 | KRT4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3851 | keratin 4 |
| ENSG00000163209 | SPRR3 | NA | 6707 | small proline rich protein 3 |
| ENSG00000163220 | S100A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | 6280 | S100 calcium binding protein A9 |
| ENSG00000229732 | AC019349.5 | NA | ENSG00000229732 | NA |
| ENSG00000205420 | KRT6A | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3853 | keratin 6A |
| ENSG00000135046 | ANXA1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | 301 | annexin A1 |
| ENSG00000143546 | S100A8 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | 6279 | S100 calcium binding protein A8 |
| ENSG00000143536 | CRNN | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | 49860 | cornulin |
| ENSG00000140519 | RHCG | NA | 51458 | Rh family C glycoprotein |
| ENSG00000186081 | KRT5 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3852 | keratin 5 |
| ENSG00000160213 | CSTB | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins l, h and b. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). | 1476 | cystatin B |
| ENSG00000118898 | PPL | The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | 5493 | periplakin |
| ENSG00000134531 | EMP1 | NA | 2012 | epithelial membrane protein 1 |
| ENSG00000107317 | PTGDS | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | 5730 | prostaglandin D2 synthase |
| ENSG00000197971 | MBP | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | 4155 | myelin basic protein |
| ENSG00000121552 | CSTA | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins, and kininogens. This gene encodes a stefin that functions as a cysteine protease inhibitor, forming tight complexes with papain and the cathepsins B, H, and L. The protein is one of the precursor proteins of cornified cell envelope in keratinocytes and plays a role in epidermal development and maintenance. Stefins have been proposed as prognostic and diagnostic tools for cancer. | 1475 | cystatin A |
| ENSG00000111640 | GAPDH | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | 2597 | glyceraldehyde-3-phosphate dehydrogenase |
| ENSG00000143369 | ECM1 | This gene encodes a soluble protein that is involved in endochondral bone formation, angiogenesis, and tumor biology. It also interacts with a variety of extracellular and structural proteins, contributing to the maintenance of skin integrity and homeostasis. Mutations in this gene are associated with lipoid proteinosis disorder (also known as hyalinosis cutis et mucosae or Urbach-Wiethe disease) that is characterized by generalized thickening of skin, mucosae and certain viscera. Alternatively spliced transcript variants encoding distinct isoforms have been described for this gene. | 1893 | extracellular matrix protein 1 |
| ENSG00000065978 | YBX1 | This gene encodes a highly conserved cold shock domain protein that has broad nucleic acid binding properties. The encoded protein functions as both a DNA and RNA binding protein and has been implicated in numerous cellular processes including regulation of transcription and translation, pre-mRNA splicing, DNA reparation and mRNA packaging. This protein is also a component of messenger ribonucleoprotein (mRNP) complexes and may have a role in microRNA processing. This protein can be secreted through non-classical pathways and functions as an extracellular mitogen. Aberrant expression of the gene is associated with cancer proliferation in numerous tissues. This gene may be a prognostic marker for poor outcome and drug resistance in certain cancers. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on multiple chromosomes. | 4904 | Y-box binding protein 1 |
| ENSG00000133710 | SPINK5 | This gene encodes a multidomain serine protease inhibitor that contains 15 potential inhibitory domains. The encoded preproprotein is proteolytically processed to generate multiple protein products, which may exhibit unique activities and specificities. These proteins may play a role in skin and hair morphogenesis, as well as anti-inflammatory and antimicrobial protection of mucous epithelia. Mutations in this gene may result in Netherton syndrome, a disorder characterized by ichthyosis, defective cornification, and atopy. This gene is present in a gene cluster on chromosome 5. Alternative splicing results in multiple transcript variants. | 11005 | serine peptidase inhibitor, Kazal type 5 |
| ENSG00000163017 | ACTG2 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | 72 | actin, gamma 2, smooth muscle, enteric |
| ENSG00000125780 | TGM3 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene consists of two polypeptide chains activated from a single precursor protein by proteolysis. The encoded protein is involved in the later stages of cell envelope formation in the epidermis and hair follicle. | 7053 | transglutaminase 3 |
| ENSG00000124942 | AHNAK | NA | 79026 | AHNAK nucleoprotein |
| ENSG00000060138 | YBX3 | NA | 8531 | Y-box binding protein 3 |
| ENSG00000047849 | MAP4 | The protein encoded by this gene is a major non-neuronal microtubule-associated protein. This protein contains a domain similar to the microtubule-binding domains of neuronal microtubule-associated protein (MAP2) and microtubule-associated protein tau (MAPT/TAU). This protein promotes microtubule assembly, and has been shown to counteract destabilization of interphase microtubule catastrophe promotion. Cyclin B was found to interact with this protein, which targets cell division cycle 2 (CDC2) kinase to microtubules. The phosphorylation of this protein affects microtubule properties and cell cycle progression. Multiple transcript variants encoding different isoforms have been found for this gene. | 4134 | microtubule associated protein 4 |
| ENSG00000189334 | S100A14 | This gene encodes a member of the S100 protein family which contains an EF-hand motif and binds calcium. The gene is located in a cluster of S100 genes on chromosome 1. Levels of the encoded protein have been found to be lower in cancerous tissue and associated with metastasis suggesting a tumor suppressor function (PMID: 19956863, 19351828). | 57402 | S100 calcium binding protein A14 |
| ENSG00000009307 | CSDE1 | NA | 7812 | cold shock domain containing E1 |
| ENSG00000080824 | HSP90AA1 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | 3320 | heat shock protein 90kDa alpha family class A member 1 |
| ENSG00000174437 | ATP2A2 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. | 488 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 |
| ENSG00000171345 | KRT19 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | 3880 | keratin 19 |
| ENSG00000169474 | SPRR1A | NA | 6698 | small proline rich protein 1A |
| ENSG00000152556 | PFKM | Three phosphofructokinase isozymes exist in humans: muscle, liver and platelet. These isozymes function as subunits of the mammalian tetramer phosphofructokinase, which catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. Tetramer composition varies depending on tissue type. This gene encodes the muscle-type isozyme. Mutations in this gene have been associated with glycogen storage disease type VII, also known as Tarui disease. Alternatively spliced transcript variants have been described. | 5213 | phosphofructokinase, muscle |
| ENSG00000165272 | AQP3 | This gene encodes the water channel protein aquaporin 3. Aquaporins are a family of small integral membrane proteins related to the major intrinsic protein, also known as aquaporin 0. Aquaporin 3 is localized at the basal lateral membranes of collecting duct cells in the kidney. In addition to its water channel function, aquaporin 3 has been found to facilitate the transport of nonionic small solutes such as urea and glycerol, but to a smaller degree. It has been suggested that water channels can be functionally heterogeneous and possess water and solute permeation mechanisms. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | 360 | aquaporin 3 (Gill blood group) |
| ENSG00000184292 | TACSTD2 | This intronless gene encodes a carcinoma-associated antigen. This antigen is a cell surface receptor that transduces calcium signals. Mutations of this gene have been associated with gelatinous drop-like corneal dystrophy. | 4070 | tumor-associated calcium signal transducer 2 |
| ENSG00000241794 | SPRR2A | NA | 6700 | small proline rich protein 2A |
| ENSG00000170315 | UBB | This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | 7314 | ubiquitin B |
| ENSG00000136689 | IL1RN | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | 3557 | interleukin 1 receptor antagonist |
| ENSG00000114416 | FXR1 | The protein encoded by this gene is an RNA binding protein that interacts with the functionally-similar proteins FMR1 and FXR2. These proteins shuttle between the nucleus and cytoplasm and associate with polyribosomes, predominantly with the 60S ribosomal subunit. Three transcript variants encoding different isoforms have been found for this gene. | 8087 | FMR1 autosomal homolog 1 |
| ENSG00000171346 | KRT15 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region on chromosome 17q21.2. | 3866 | keratin 15 |
| ENSG00000136811 | ODF2 | The outer dense fibers are cytoskeletal structures that surround the axoneme in the middle piece and principal piece of the sperm tail. The fibers function in maintaining the elastic structure and recoil of the sperm tail as well as in protecting the tail from shear forces during epididymal transport and ejaculation. Defects in the outer dense fibers lead to abnormal sperm morphology and infertility. This gene encodes one of the major outer dense fiber proteins. Alternative splicing results in multiple transcript variants. The longer transcripts, also known as ‘Cenexins’, encode proteins with a C-terminal extension that are differentially targeted to somatic centrioles and thought to be crucial for the formation of microtubule organizing centers. | 4957 | outer dense fiber of sperm tails 2 |
| ENSG00000126777 | KTN1 | This gene encodes an integral membrane protein that is a member of the kinectin protein family. The encoded protein is primarily localized to the endoplasmic reticulum membrane. This protein binds kinesin and may be involved in intracellular organelle motility. This protein also binds translation elongation factor-delta and may be involved in the assembly of the elongation factor-1 complex. Alternate splicing results in multiple transcript variants of this gene. | 3895 | kinectin 1 |
| ENSG00000120885 | CLU | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | 1191 | clusterin |
| ENSG00000175793 | SFN | NA | 2810 | stratifin |
| ENSG00000198467 | TPM2 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 7169 | tropomyosin 2 (beta) |
| ENSG00000163191 | S100A11 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in motility, invasion, and tubulin polymerization. Chromosomal rearrangements and altered expression of this gene have been implicated in tumor metastasis. | 6282 | S100 calcium binding protein A11 |
| ENSG00000160014 | CALM3 | NA | 808 | calmodulin 3 (phosphorylase kinase, delta) |
| ENSG00000160014 | CALM2 | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | 805 | calmodulin 2 (phosphorylase kinase, delta) |
| ENSG00000178104 | PDE4DIP | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | 9659 | phosphodiesterase 4D interacting protein |
| ENSG00000154358 | OBSCN | The obscurin gene spans more than 150 kb, contains over 80 exons and encodes a protein of approximately 720 kDa. The encoded protein contains 68 Ig domains, 2 fibronectin domains, 1 calcium/calmodulin-binding domain, 1 RhoGEF domain with an associated PH domain, and 2 serine-threonine kinase domains. This protein belongs to the family of giant sarcomeric signaling proteins that includes titin and nebulin, and may have a role in the organization of myofibrils during assembly and may mediate interactions between the sarcoplasmic reticulum and myofibrils. Alternatively spliced transcript variants encoding different isoforms have been identified. | 84033 | obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF |
| ENSG00000065150 | IPO5 | Nucleocytoplasmic transport, a signal- and energy-dependent process, takes place through nuclear pore complexes embedded in the nuclear envelope. The import of proteins containing a nuclear localization signal (NLS) requires the NLS import receptor, a heterodimer of importin alpha and beta subunits also known as karyopherins. Importin alpha binds the NLS-containing cargo in the cytoplasm and importin beta docks the complex at the cytoplasmic side of the nuclear pore complex. In the presence of nucleoside triphosphates and the small GTP binding protein Ran, the complex moves into the nuclear pore complex and the importin subunits dissociate. Importin alpha enters the nucleoplasm with its passenger protein and importin beta remains at the pore. Interactions between importin beta and the FG repeats of nucleoporins are essential in translocation through the pore complex. The protein encoded by this gene is a member of the importin beta family. | 3843 | importin 5 |
| ENSG00000172005 | MAL | The protein encoded by this gene is a highly hydrophobic integral membrane protein belonging to the MAL family of proteolipids. The protein has been localized to the endoplasmic reticulum of T-cells and is a candidate linker protein in T-cell signal transduction. In addition, this proteolipid is localized in compact myelin of cells in the nervous system and has been implicated in myelin biogenesis and/or function. The protein plays a role in the formation, stabilization and maintenance of glycosphingolipid-enriched membrane microdomains. Down-regulation of this gene has been associated with a variety of human epithelial malignancies. Alternative splicing produces four transcript variants which vary from each other by the presence or absence of alternatively spliced exons 2 and 3. | 4118 | mal, T-cell differentiation protein |
| ENSG00000078674 | PCM1 | The protein encoded by this gene is a component of centriolar satellites, which are electron dense granules scattered around centrosomes. Inhibition studies show that this protein is essential for the correct localization of several centrosomal proteins, and for anchoring microtubules to the centrosome. Chromosomal aberrations involving this gene are associated with papillary thyroid carcinomas and a variety of hematological malignancies, including atypical chronic myeloid leukemia and T-cell lymphoma. Multiple transcript variants encoding different isoforms have been found for this gene. | 5108 | pericentriolar material 1 |
| ENSG00000149925 | ALDOA | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | 226 | aldolase, fructose-bisphosphate A |
| ENSG00000092295 | TGM1 | The protein encoded by this gene is a membrane protein that catalyzes the addition of an alkyl group from an alkylamine to a glutamine residue of a protein, forming an alkylglutamine in the protein. This protein alkylation leads to crosslinking of proteins and catenation of polyamines to proteins. This gene contains either one or two copies of a 22 nt repeat unit in its 3’ UTR. Mutations in this gene have been associated with autosomal recessive lamellar ichthyosis (LI) and nonbullous congenital ichthyosiform erythroderma (NCIE). | 7051 | transglutaminase 1 |
| ENSG00000143549 | TPM3 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. | 7170 | tropomyosin 3 |
| ENSG00000172270 | BSG | The protein encoded by this gene is a plasma membrane protein that is important in spermatogenesis, embryo implantation, neural network formation, and tumor progression. The encoded protein is also a member of the immunoglobulin superfamily. Multiple transcript variants encoding different isoforms have been found for this gene. | 682 | basigin (Ok blood group) |
| ENSG00000186395 | KRT10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | keratin 10 |
| ENSG00000185787 | MORF4L1 | NA | 10933 | mortality factor 4 like 1 |
| ENSG00000167468 | GPX4 | This gene encodes a member of the glutathione peroxidase protein family. Glutathione peroxidase catalyzes the reduction of hydrogen peroxide, organic hydroperoxide, and lipid peroxides by reduced glutathione and functions in the protection of cells against oxidative damage. Human plasma glutathione peroxidase has been shown to be a selenium-containing enzyme and the UGA codon is translated into a selenocysteine. The encoded protein has been identified as a moonlighting protein based on its ability to serve dual functions as a peroxidase as well as a structural protein in mature spermatozoa. Through alternative splicing and transcription initiation, rat produces proteins that localize to the nucleus, mitochondrion, and cytoplasm. In humans, alternative transcription initiation and the cleavage sites of the mitochondrial and nuclear transit peptides need to be experimentally verified. Alternative splicing results in multiple transcript variants. | 2879 | glutathione peroxidase 4 |
| ENSG00000129250 | KIF1C | The protein encoded by this gene is a member of the kinesin-like protein family. The family members are microtubule-dependent molecular motors that transport organelles within cells and move chromosomes during cell division. Mutations in this gene are a cause of spastic ataxia 2, autosomal recessive. | 10749 | kinesin family member 1C |
| ENSG00000134202 | GSTM3 | Cytosolic and membrane-bound forms of glutathione S-transferase are encoded by two distinct supergene families. At present, eight distinct classes of the soluble cytoplasmic mammalian glutathione S-transferases have been identified: alpha, kappa, mu, omega, pi, sigma, theta and zeta. This gene encodes a glutathione S-transferase that belongs to the mu class. The mu class of enzymes functions in the detoxification of electrophilic compounds, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress, by conjugation with glutathione. The genes encoding the mu class of enzymes are organized in a gene cluster on chromosome 1p13.3 and are known to be highly polymorphic. These genetic variations can change an individual’s susceptibility to carcinogens and toxins as well as affect the toxicity and efficacy of certain drugs. Mutations of this class mu gene have been linked with a slight increase in a number of cancers, likely due to exposure with environmental toxins. Alternative splicing results in multiple transcript variants. | 2947 | glutathione S-transferase mu 3 (brain) |
| ENSG00000196465 | MYL6B | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in both slow-twitch skeletal muscle and in nonmuscle tissue. Alternative splicing results in multiple transcript variants. | 140465 | myosin light chain 6B |
| ENSG00000153827 | TRIP12 | NA | 9320 | thyroid hormone receptor interactor 12 |
| ENSG00000109846 | CRYAB | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | 1410 | crystallin alpha B |
| ENSG00000165474 | GJB2 | This gene encodes a member of the gap junction protein family. The gap junctions were first characterized by electron microscopy as regionally specialized structures on plasma membranes of contacting adherent cells. These structures were shown to consist of cell-to-cell channels that facilitate the transfer of ions and small molecules between cells. The gap junction proteins, also known as connexins, purified from fractions of enriched gap junctions from different tissues differ. According to sequence similarities at the nucleotide and amino acid levels, the gap junction proteins are divided into two categories, alpha and beta. Mutations in this gene are responsible for as much as 50% of pre-lingual, recessive deafness. | 2706 | gap junction protein beta 2 |
| ENSG00000128591 | FLNC | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 | filamin C |
| ENSG00000204469 | PRRC2A | A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF alpha and TNF beta. These genes are all within the human major histocompatibility complex class III region. This gene has microsatellite repeats which are associated with the age-at-onset of insulin-dependent diabetes mellitus (IDDM) and possibly thought to be involved with the inflammatory process of pancreatic beta-cell destruction during the development of IDDM. This gene is also a candidate gene for the development of rheumatoid arthritis. Two transcript variants encoding the same protein have been found for this gene. | 7916 | proline rich coiled-coil 2A |
| ENSG00000109971 | HSPA8 | This gene encodes a member of the heat shock protein 70 family, which contains both heat-inducible and constitutively expressed members. This protein belongs to the latter group, which are also referred to as heat-shock cognate proteins. It functions as a chaperone, and binds to nascent polypeptides to facilitate correct folding. It also functions as an ATPase in the disassembly of clathrin-coated vesicles during transport of membrane components through the cell. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 3312 | heat shock protein family A (Hsp70) member 8 |
| ENSG00000021355 | SERPINB1 | The protein encoded by this gene is a member of the serpin family of proteinase inhibitors. Members of this family maintain homeostasis by neutralizing overexpressed proteinase activity through their function as suicide substrates. This protein inhibits the neutrophil-derived proteinases neutrophil elastase, cathepsin G, and proteinase-3 and thus protects tissues from damage at inflammatory sites. Alternative splicing results in multiple transcript variants. | 1992 | serpin family B member 1 |
| ENSG00000075151 | EIF4G3 | The protein encoded by this gene is thought to be part of the eIF4F protein complex, which is involved in mRNA cap recognition and transport of mRNAs to the ribosome. Interestingly, a microRNA (miR-520c-3p) has been found that negatively regulates synthesis of the encoded protein, and this leads to a global decrease in protein translation and cell proliferation. Therefore, this protein is a key component of the anti-tumor activity of miR-520c-3p. | 8672 | eukaryotic translation initiation factor 4 gamma 3 |
| ENSG00000115414 | FN1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | fibronectin 1 |
| ENSG00000067225 | PKM | This gene encodes a protein involved in glycolysis. The encoded protein is a pyruvate kinase that catalyzes the transfer of a phosphoryl group from phosphoenolpyruvate to ADP, generating ATP and pyruvate. This protein has been shown to interact with thyroid hormone and may mediate cellular metabolic effects induced by thyroid hormones. This protein has been found to bind Opa protein, a bacterial outer membrane protein involved in gonococcal adherence to and invasion of human cells, suggesting a role of this protein in bacterial pathogenesis. Several alternatively spliced transcript variants encoding a few distinct isoforms have been reported. | 5315 | pyruvate kinase, muscle |
| ENSG00000070371 | CLTCL1 | This gene is a member of the clathrin heavy chain family and encodes a major protein of the polyhedral coat of coated pits and vesicles. Chromosomal aberrations involving this gene are associated with meningioma, DiGeorge syndrome, and velo-cardio-facial syndrome. Multiple transcript variants encoding different isoforms have been found for this gene. | 8218 | clathrin heavy chain like 1 |
| ENSG00000134243 | SORT1 | This gene encodes a member of the VPS10-related sortilin family of proteins. The encoded preproprotein is proteolytically processed by furin to generate the mature receptor. This receptor plays a role in the trafficking of different proteins to either the cell surface, or subcellular compartments such as lysosomes and endosomes. Expression levels of this gene may influence the risk of myocardial infarction in human patients. Alternative splicing results in multiple transcript variants. | 6272 | sortilin 1 |
| ENSG00000082641 | NFE2L1 | This gene encodes a protein that is involved in globin gene expression in erythrocytes. Confusion has occurred in bibliographic databases due to the shared symbol of NRF1 for this gene, NFE2L1, and for ‘nuclear respiratory factor 1’ which has an official symbol of NRF1. | 4779 | nuclear factor, erythroid 2 like 1 |
| ENSG00000158828 | PINK1 | This gene encodes a serine/threonine protein kinase that localizes to mitochondria. It is thought to protect cells from stress-induced mitochondrial dysfunction. Mutations in this gene cause one form of autosomal recessive early-onset Parkinson disease. | 65018 | PTEN induced putative kinase 1 |
| ENSG00000188554 | NBR1 | The protein encoded by this gene was originally identified as an ovarian tumor antigen monitored in ovarian cancer. The encoded protein contains a B-box/coiled-coil motif, which is present in many genes with transformation potential. It functions as a specific autophagy receptor for the selective autophagic degradation of peroxisomes by forming intracellular inclusions with ubiquitylated autophagic substrates. This gene is located on a region of chromosome 17q21.1 that is in close proximity to the BRCA1 tumor suppressor gene. Alternative splicing of this gene results in multiple transcript variants. | 4077 | NBR1, autophagy cargo receptor |
| ENSG00000160299 | PCNT | The protein encoded by this gene binds to calmodulin and is expressed in the centrosome. It is an integral component of the pericentriolar material (PCM). The protein contains a series of coiled-coil domains and a highly conserved PCM targeting motif called the PACT domain near its C-terminus. The protein interacts with the microtubule nucleation component gamma-tubulin and is likely important to normal functioning of the centrosomes, cytoskeleton, and cell-cycle progression. Mutations in this gene cause Seckel syndrome-4 and microcephalic osteodysplastic primordial dwarfism type II. Two transcript variants encoding different isoforms have been found for this gene. | 5116 | pericentrin |
| ENSG00000142156 | COL6A1 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | 1291 | collagen type VI alpha 1 |
| ENSG00000141753 | IGFBP4 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. | 3487 | insulin like growth factor binding protein 4 |
| ENSG00000204592 | HLA-E | HLA-E belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-E binds a restricted subset of peptides derived from the leader peptides of other class I molecules. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. | 3133 | major histocompatibility complex, class I, E |
| ENSG00000126803 | HSPA2 | NA | 3306 | heat shock protein family A (Hsp70) member 2 |
| ENSG00000204463 | BAG6 | This gene was first characterized as part of a cluster of genes located within the human major histocompatibility complex class III region. This gene encodes a nuclear protein that is cleaved by caspase 3 and is implicated in the control of apoptosis. In addition, the protein forms a complex with E1A binding protein p300 and is required for the acetylation of p53 in response to DNA damage. Multiple transcript variants encoding different isoforms have been found for this gene. | 7917 | BCL2 associated athanogene 6 |
| ENSG00000131095 | GFAP | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 2670 | glial fibrillary acidic protein |
| ENSG00000115758 | ODC1 | This gene encodes the rate-limiting enzyme of the polyamine biosynthesis pathway which catalyzes ornithine to putrescine. The activity level for the enzyme varies in response to growth-promoting stimuli and exhibits a high turnover rate in comparison to other mammalian proteins. Originally localized to both chromosomes 2 and 7, the gene encoding this enzyme has been determined to be located on 2p25, with a pseudogene located on 7q31-qter. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified. | 4953 | ornithine decarboxylase 1 |
| ENSG00000103994 | ZNF106 | NA | 64397 | zinc finger protein 106 |
| ENSG00000089737 | DDX24 | DEAD box proteins, characterized by the conserved motif Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein, which shows little similarity to any of the other known human DEAD box proteins, but shows a high similarity to mouse Ddx24 at the amino acid level. | 57062 | DEAD-box helicase 24 |
| ENSG00000018625 | ATP1A2 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 2 subunit. Mutations in this gene result in familial basilar or hemiplegic migraines, and in a rare syndrome known as alternating hemiplegia of childhood. | 477 | ATPase Na+/K+ transporting subunit alpha 2 |
| ENSG00000234964 | FABP5P7 | NA | ENSG00000234964 | fatty acid binding protein 5 pseudogene 7 |
| ENSG00000143248 | RGS5 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | 8490 | regulator of G-protein signaling 5 |
| ENSG00000128016 | ZFP36 | NA | 7538 | ZFP36 ring finger protein |
| ENSG00000127481 | UBR4 | The protein encoded by this gene is an E3 ubiquitin-protein ligase that interacts with the retinoblastoma-associated protein in the nucleus and with calcium-bound calmodulin in the cytoplasm. The encoded protein appears to be a cytoskeletal component in the cytoplasm and part of the chromatin scaffold in the nucleus. In addition, this protein is a target of the human papillomavirus type 16 E7 oncoprotein. | 23352 | ubiquitin protein ligase E3 component n-recognin 4 |
| ENSG00000115677 | HDLBP | The protein encoded by this gene binds high density lipoprotein (HDL) and may function to regulate excess cholesterol levels in cells. The encoded protein also binds RNA and can induce heterochromatin formation. | 3069 | high density lipoprotein binding protein |
| ENSG00000078618 | NRDC | This gene encodes a zinc-dependent endopeptidase that cleaves peptide substrates at the N-terminus of arginine residues in dibasic moieties and is a member of the peptidase M16 family. This protein interacts with heparin-binding EGF-like growth factor and plays a role in cell migration and proliferation. Multiple transcript variants encoding different isoforms have been found for this gene. | 4898 | nardilysin convertase |
| ENSG00000026508 | CD44 | The protein encoded by this gene is a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. It is a receptor for hyaluronic acid (HA) and can also interact with other ligands, such as osteopontin, collagens, and matrix metalloproteinases (MMPs). This protein participates in a wide variety of cellular functions including lymphocyte activation, recirculation and homing, hematopoiesis, and tumor metastasis. Transcripts for this gene undergo complex alternative splicing that results in many functionally distinct isoforms, however, the full length nature of some of these variants has not been determined. Alternative splicing is the basis for the structural and functional diversity of this protein, and may be related to tumor metastasis. | 960 | CD44 molecule (Indian blood group) |
| ENSG00000105655 | ISYNA1 | This gene encodes an inositol-3-phosphate synthase enzyme. The encoded protein plays a critical role in the myo-inositol biosynthesis pathway by catalyzing the rate-limiting conversion of glucose 6-phosphate to myoinositol 1-phosphate. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and a pseudogene of this gene is located on the short arm of chromosome 4. | 51477 | inositol-3-phosphate synthase 1 |
| ENSG00000169710 | FASN | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 | fatty acid synthase |
| ENSG00000110713 | NUP98 | Nuclear pore complexes (NPCs) regulate the transport of macromolecules between the nucleus and cytoplasm, and are composed of many polypeptide subunits, many of which belong to the nucleoporin family. This gene belongs to the nucleoporin gene family and encodes a 186 kDa precursor protein that undergoes autoproteolytic cleavage to generate a 98 kDa nucleoporin and 96 kDa nucleoporin. The 98 kDa nucleoporin contains a Gly-Leu-Phe-Gly (GLFG) repeat domain and participates in many cellular processes, including nuclear import, nuclear export, mitotic progression, and regulation of gene expression. The 96 kDa nucleoporin is a scaffold component of the NPC. Proteolytic cleavage is important for targeting of the proteins to the NPC. Translocations between this gene and many other partner genes have been observed in different leukemias. Rearrangements typically result in chimeras with the N-terminal GLFG domain of this gene to the C-terminus of the partner gene. Alternative splicing results in multiple transcript variants encoding different isoforms, at least two of which are proteolytically processed. Some variants lack the region that encodes the 96 kDa nucleoporin. | 4928 | nucleoporin 98 |
| ENSG00000188643 | S100A16 | NA | 140576 | S100 calcium binding protein A16 |
| ENSG00000197747 | S100A10 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in exocytosis and endocytosis. | 6281 | S100 calcium binding protein A10 |
# Write the queried Ensembl gene IDs for cluster 1 to a text file, one per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 1, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
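The two annotation tables in this report list the same fields in different column orders (`query` first in one, `summary` first in the other). Below is a minimal sketch of one way to pin the column order before calling `kable`, assuming only the columns visible in those tables. The helper name `order_annotation_columns` and the toy data frame are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper (not in the original analysis): return the annotation
# columns in a fixed order so that every cluster's table reads the same way.
order_annotation_columns <- function(df,
                                     cols = c("query", "symbol", "name", "summary")) {
  df <- as.data.frame(df)
  # Keep only the requested columns that are actually present, in the order given.
  present <- cols[cols %in% colnames(df)]
  df[, present, drop = FALSE]
}

# Toy input that mimics the column order of the second table in this report.
toy <- data.frame(
  summary = c("Protamine 2 summary text", NA),
  X_id    = c("5620", "5619"),
  symbol  = c("PRM2", "PRM1"),
  name    = c("protamine 2", "protamine 1"),
  query   = c("ENSG00000122304", "ENSG00000175646"),
  stringsAsFactors = FALSE
)

ordered <- order_annotation_columns(toy)
print(colnames(ordered))
```

With a helper like this, the display step could become `kable(order_annotation_columns(out))`, which would also drop the `X_id` column unless it is added to `cols`.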
out <- mygene::queryMany(gene_list[2,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | symbol | name | query |
|---|---|---|---|---|
| Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | 5620 | PRM2 | protamine 2 | ENSG00000122304 |
| NA | 5619 | PRM1 | protamine 1 | ENSG00000175646 |
| This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | FN1 | fibronectin 1 | ENSG00000115414 |
| Spermatogenesis is a complex process regulated by extracellular and intracellular factors as well as cellular interactions among interstitial cells of the testis, Sertoli cells, and germ cells. This gene is expressed in the testis in Sertoli cells but not germ cells. The protein encoded by this gene contains plant homeodomain (PHD) finger domains, also known as leukemia associated protein (LAP) domains, believed to be involved in transcriptional regulation. The protein, which localizes to the nucleus of transfected cells, has been implicated in the transcriptional regulation of spermatogenesis. Alternate splicing results in multiple transcript variants of this gene. | 51533 | PHF7 | PHD finger protein 7 | ENSG00000010318 |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | KRT13 | keratin 13 | ENSG00000171401 |
| This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | 4151 | MB | myoglobin | ENSG00000198125 |
| This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | DES | desmin | ENSG00000175084 |
| Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | 4625 | MYH7 | myosin, heavy chain 7, cardiac muscle, beta | ENSG00000092054 |
| This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, an N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | TTN | titin | ENSG00000155657 |
| This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 2670 | GFAP | glial fibrillary acidic protein | ENSG00000131095 |
| The outer dense fibers are cytoskeletal structures that surround the axoneme in the middle piece and principal piece of the sperm tail. The fibers function in maintaining the elastic structure and recoil of the sperm tail as well as in protecting the tail from shear forces during epididymal transport and ejaculation. Defects in the outer dense fibers lead to abnormal sperm morphology and infertility. This gene encodes one of the major outer dense fiber proteins. Alternative splicing results in multiple transcript variants. The longer transcripts, also known as ‘Cenexins’, encode proteins with a C-terminal extension that are differentially targeted to somatic centrioles and thought to be crucial for the formation of microtubule organizing centers. | 4957 | ODF2 | outer dense fiber of sperm tails 2 | ENSG00000136811 |
| NA | 81691 | LOC81691 | exonuclease NEF-sp | ENSG00000005189 |
| This gene encodes a lysine-specific histone demethylase that belongs to the jumonji/ARID domain-containing family of histone demethylases. The encoded protein is capable of demethylating tri-, di- and monomethylated lysine 4 of histone H3. This protein plays a role in the transcriptional repression of certain tumor suppressor genes and is upregulated in certain cancer cells. This protein may also play a role in genome stability and DNA repair. Alternate splicing results in multiple transcript variants. | 10765 | KDM5B | lysine demethylase 5B | ENSG00000117139 |
| This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 | ACTB | actin, beta | ENSG00000075624 |
| This gene belongs to the ATP-ases associated with diverse cellular activities (AAA+) superfamily. Members of this superfamily form ring-shaped homo-hexamers and have highly conserved ATPase domains that are involved in various processes including DNA replication, protein degradation and reactivation of misfolded proteins. All members of this family hydrolyze ATP through their AAA+ domains and use the energy generated through ATP hydrolysis to exert mechanical force on their substrates. In addition to an AAA+ domain, the protein encoded by this gene contains a C-terminal D2 domain, which is characteristic of the AAA+ subfamily of Caseinolytic peptidases to which this protein belongs. It cooperates with Hsp70 in the disaggregation of protein aggregates. Allelic variants of this gene are associated with 3-methylglutaconic aciduria, which causes cataracts and neutropenia. Alternative splicing results in multiple transcript variants. | 81570 | CLPB | ClpB homolog, mitochondrial AAA ATPase chaperonin | ENSG00000162129 |
| This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 | FLNC | filamin C | ENSG00000128591 |
| The protein encoded by this gene interacts with components of the origin recognition complex (ORC) and regulates the formation of the prereplicative complex. The encoded protein stabilizes the ORC and therefore aids in DNA replication. This protein is required for the G1/S phase transition of the cell cycle. In addition, the encoded protein binds to trimethylated histone H3 in heterochromatin and recruits the ORC and lysine methyltransferases, which help maintain the repressive heterochromatic state. Two transcript variants encoding different isoforms have been found for this gene. | 222229 | LRWD1 | leucine rich repeats and WD repeat domain containing 1 | ENSG00000161036 |
| This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | 4703 | NEB | nebulin | ENSG00000183091 |
| NA | 128229 | TSACC | TSSK6 activating cochaperone | ENSG00000163467 |
| The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | 4878 | NPPA | natriuretic peptide A | ENSG00000175206 |
| The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | 2752 | GLUL | glutamate-ammonia ligase | ENSG00000135821 |
| This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | 2878 | GPX3 | glutathione peroxidase 3 | ENSG00000211445 |
| HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | 3106 | HLA-B | major histocompatibility complex, class I, B | ENSG00000234745 |
| The protein encoded by this gene is similar to proacrosin binding protein sp32 precursor found in mouse, guinea pig, and pig. This protein is located in the sperm acrosome and is thought to function as a binding protein to proacrosin for packaging and condensation of the acrosin zymogen in the acrosomal matrix. This protein is a member of the cancer/testis family of antigens and it is found to be immunogenic. In normal tissues, this mRNA is expressed only in testis, whereas it is detected in a range of different tumor types such as bladder, breast, lung, liver, and colon. | 84519 | ACRBP | acrosin binding protein | ENSG00000111644 |
| NA | ENSG00000229732 | AC019349.5 | NA | ENSG00000229732 |
| This gene encodes the largest subunit of RNA polymerase II, the polymerase responsible for synthesizing messenger RNA in eukaryotes. The product of this gene contains a carboxy terminal domain composed of heptapeptide repeats that are essential for polymerase activity. These repeats contain serine and threonine residues that are phosphorylated in actively transcribing RNA polymerase. In addition, this subunit, in combination with several other polymerase subunits, forms the DNA binding domain of the polymerase, a groove in which the DNA template is transcribed into RNA. | 5430 | POLR2A | polymerase (RNA) II subunit A | ENSG00000181222 |
| The protein encoded by this gene is a member of the PP2C family of Ser/Thr protein phosphatases. PP2C family members are known to be negative regulators of cell stress response pathways. This phosphatase is found to be responsible for the dephosphorylation of Pre-mRNA splicing factors, which is important for the formation of functional spliceosome. Studies of a similar gene in mice suggested a role of this phosphatase in regulating cell cycle progression. | 5496 | PPM1G | protein phosphatase, Mg2+/Mn2+ dependent 1G | ENSG00000115241 |
| This gene is proposed to play a role in cerebral cortical development. Mutations in this gene have been associated with microencephaly, cortical malformations, and mental retardation. Alternative splicing results in multiple transcript variants. | 284403 | WDR62 | WD repeat domain 62 | ENSG00000075702 |
| This gene encodes a protein belonging to the glyceraldehyde-3-phosphate dehydrogenase family of enzymes that play an important role in carbohydrate metabolism. Like its somatic cell counterpart, this sperm-specific enzyme functions in a nicotinamide adenine dinucleotide-dependent manner to remove hydrogen and add phosphate to glyceraldehyde 3-phosphate to form 1,3-diphosphoglycerate. During spermiogenesis, this enzyme may play an important role in regulating the switch between different energy-producing pathways, and it is required for sperm motility and male fertility. | 26330 | GAPDHS | glyceraldehyde-3-phosphate dehydrogenase, spermatogenic | ENSG00000105679 |
| NA | ENSG00000219435 | TEX40 | testis expressed 40 | ENSG00000219435 |
| NA | 6707 | SPRR3 | small proline rich protein 3 | ENSG00000163209 |
| NA | 64753 | CCDC136 | coiled-coil domain containing 136 | ENSG00000128596 |
| The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | 7139 | TNNT2 | troponin T2, cardiac type | ENSG00000118194 |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | 6280 | S100A9 | S100 calcium binding protein A9 | ENSG00000163220 |
| NA | 100129518 | LOC100129518 | uncharacterized LOC100129518 | ENSG00000112096 |
| This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | 6648 | SOD2 | superoxide dismutase 2, mitochondrial | ENSG00000112096 |
| This gene encodes a highly conserved cold shock domain protein that has broad nucleic acid binding properties. The encoded protein functions as both a DNA and RNA binding protein and has been implicated in numerous cellular processes including regulation of transcription and translation, pre-mRNA splicing, DNA reparation and mRNA packaging. This protein is also a component of messenger ribonucleoprotein (mRNP) complexes and may have a role in microRNA processing. This protein can be secreted through non-classical pathways and functions as an extracellular mitogen. Aberrant expression of the gene is associated with cancer proliferation in numerous tissues. This gene may be a prognostic marker for poor outcome and drug resistance in certain cancers. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on multiple chromosomes. | 4904 | YBX1 | Y-box binding protein 1 | ENSG00000065978 |
| This gene encodes a member of the DnaJ or Hsp40 (heat shock protein 40 kD) family of proteins. DNAJ family members are characterized by a highly conserved amino acid stretch called the ‘J-domain’ and function as one of the two major classes of molecular chaperones involved in a wide range of cellular events, such as protein folding and oligomeric protein complex assembly. The encoded protein is a molecular chaperone that stimulates the ATPase activity of Hsp70 heat-shock proteins in order to promote protein folding and prevent misfolded protein aggregation. Alternative splicing results in multiple transcript variants. | 3337 | DNAJB1 | DnaJ heat shock protein family (Hsp40) member B1 | ENSG00000132002 |
| The protein encoded by this gene belongs to the superfamily of small heat-shock proteins containing a conservative alpha-crystallin domain at the C-terminal part of the molecule. The expression of this gene is induced by estrogen in estrogen receptor-positive breast cancer cells, and this protein also functions as a chaperone in association with Bag3, a stimulator of macroautophagy. Thus, this gene appears to be involved in regulation of cell proliferation, apoptosis, and carcinogenesis, and mutations in this gene have been associated with different neuromuscular diseases, including Charcot-Marie-Tooth disease. | 26353 | HSPB8 | heat shock protein family B (small) member 8 | ENSG00000152137 |
| This gene encodes an inositol-3-phosphate synthase enzyme. The encoded protein plays a critical role in the myo-inositol biosynthesis pathway by catalyzing the rate-limiting conversion of glucose 6-phosphate to myoinositol 1-phosphate. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and a pseudogene of this gene is located on the short arm of chromosome 4. | 51477 | ISYNA1 | inositol-3-phosphate synthase 1 | ENSG00000105655 |
| This gene encodes a protein containing a MYND-type zinc finger domain that likely functions in assembly of the dynein motor. Mutations in this gene can cause primary ciliary dyskinesia. This gene is also considered a tumor suppressor gene and is often mutated, deleted, or hypermethylated and silenced in cancer cells. Alternative splicing results in multiple transcript variants. | 51364 | ZMYND10 | zinc finger MYND-type containing 10 | ENSG00000004838 |
| This gene encodes a kinesin-like protein that functions as a microtubule-dependent molecular motor. The encoded protein can depolymerize microtubules at the plus end, thereby promoting mitotic chromosome segregation. Alternative splicing results in multiple transcript variants. | 11004 | KIF2C | kinesin family member 2C | ENSG00000142945 |
| This gene encodes a member of the F-box protein family, members of which are characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into three classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene contains WD-40 domains, in addition to an F-box motif, so it belongs to the Fbw class. Alternatively spliced transcript variants encoding distinct isoforms have been identified for this gene, however, they were found to be nonsense-mediated mRNA decay (NMD) candidates, hence not represented. | 54461 | FBXW5 | F-box and WD repeat domain containing 5 | ENSG00000159069 |
| This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 7169 | TPM2 | tropomyosin 2 (beta) | ENSG00000198467 |
| NA | 8404 | SPARCL1 | SPARC like 1 | ENSG00000152583 |
| This gene encodes a major cytoplasmic protein which is the only known constituent common to submembranous plaques of both desmosomes and intermediate junctions. This protein forms distinct complexes with cadherins and desmosomal cadherins and is a member of the catenin family since it contains a distinct repeating amino acid motif called the armadillo repeat. Mutation in this gene has been associated with Naxos disease. Alternative splicing occurs in this gene; however, not all transcripts have been fully described. | 3728 | JUP | junction plakoglobin | ENSG00000173801 |
| This gene encodes a member of the alpha tubulin family. Tubulin is a major component of microtubules, which are composed of alpha- and beta-tubulin heterodimers and microtubule-associated proteins in the cytoskeleton. Microtubules maintain cellular structure, function in intracellular transport, and play a role in spindle formation during mitosis. | 113457 | TUBA3D | tubulin alpha 3d | ENSG00000075886 |
| The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | 58 | ACTA1 | actin, alpha 1, skeletal muscle | ENSG00000143632 |
| This gene encodes a member of the M14 family of metallocarboxypeptidases. The encoded preproprotein is proteolytically processed to generate the mature peptidase. This peripheral membrane protein cleaves C-terminal amino acid residues and is involved in the biosynthesis of peptide hormones and neurotransmitters, including insulin. This protein may also function independently of its peptidase activity, as a neurotrophic factor that promotes neuronal survival, and as a sorting receptor that binds to regulated secretory pathway proteins, including prohormones. Mutations in this gene are implicated in type 2 diabetes. | 1363 | CPE | carboxypeptidase E | ENSG00000109472 |
| NA | 84266 | ALKBH7 | alkB homolog 7 | ENSG00000125652 |
| This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | 347 | APOD | apolipoprotein D | ENSG00000189058 |
| Nuclear pore complexes (NPCs) regulate the transport of macromolecules between the nucleus and cytoplasm, and are composed of many polypeptide subunits, many of which belong to the nucleoporin family. This gene belongs to the nucleoporin gene family and encodes a 186 kDa precursor protein that undergoes autoproteolytic cleavage to generate a 98 kDa nucleoporin and 96 kDa nucleoporin. The 98 kDa nucleoporin contains a Gly-Leu-Phe-Gly (GLFG) repeat domain and participates in many cellular processes, including nuclear import, nuclear export, mitotic progression, and regulation of gene expression. The 96 kDa nucleoporin is a scaffold component of the NPC. Proteolytic cleavage is important for targeting of the proteins to the NPC. Translocations between this gene and many other partner genes have been observed in different leukemias. Rearrangements typically result in chimeras with the N-terminal GLFG domain of this gene to the C-terminus of the partner gene. Alternative splicing results in multiple transcript variants encoding different isoforms, at least two of which are proteolytically processed. Some variants lack the region that encodes the 96 kDa nucleoporin. | 4928 | NUP98 | nucleoporin 98 | ENSG00000110713 |
| This gene was first characterized as part of a cluster of genes located within the human major histocompatibility complex class III region. This gene encodes a nuclear protein that is cleaved by caspase 3 and is implicated in the control of apoptosis. In addition, the protein forms a complex with E1A binding protein p300 and is required for the acetylation of p53 in response to DNA damage. Multiple transcript variants encoding different isoforms have been found for this gene. | 7917 | BAG6 | BCL2 associated athanogene 6 | ENSG00000204463 |
| This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 4604 | MYBPC1 | myosin binding protein C, slow type | ENSG00000196091 |
| This gene belongs to the chemokine-like factor gene superfamily, a novel family that links the chemokine and the transmembrane 4 superfamilies of signaling molecules. The protein encoded by this gene may play an important role in testicular development. | 146225 | CMTM2 | CKLF like MARVEL transmembrane domain containing 2 | ENSG00000140932 |
| NA | ENSG00000211896 | IGHG1 | immunoglobulin heavy constant gamma 1 (G1m marker) | ENSG00000211896 |
| This gene encodes a member of A-kinase anchoring proteins (AKAPs), a family of functionally related proteins that target protein kinase A to discrete locations within the cell. The encoded protein is reported to participate in protein-protein interactions with the R-subunit of the protein kinase A as well as sperm-associated proteins. This protein is expressed in spermatozoa and localized to the acrosomal region of the sperm head as well as the length of the principal piece. It may function as a regulator of motility, capacitation, and the acrosome reaction. | 10566 | AKAP3 | A-kinase anchoring protein 3 | ENSG00000111254 |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3853 | KRT6A | keratin 6A | ENSG00000205420 |
| NA | ENSG00000242349 | NPPA-AS1 | NPPA antisense RNA 1 | ENSG00000242349 |
| This gene encodes a member of the glutathione peroxidase protein family. Glutathione peroxidase catalyzes the reduction of hydrogen peroxide, organic hydroperoxide, and lipid peroxides by reduced glutathione and functions in the protection of cells against oxidative damage. Human plasma glutathione peroxidase has been shown to be a selenium-containing enzyme and the UGA codon is translated into a selenocysteine. The encoded protein has been identified as a moonlighting protein based on its ability to serve dual functions as a peroxidase as well as a structural protein in mature spermatozoa. Through alternative splicing and transcription initiation, rat produces proteins that localize to the nucleus, mitochondrion, and cytoplasm. In humans, alternative transcription initiation and the cleavage sites of the mitochondrial and nuclear transit peptides need to be experimentally verified. Alternative splicing results in multiple transcript variants. | 2879 | GPX4 | glutathione peroxidase 4 | ENSG00000167468 |
| This gene belongs to a highly conserved gene family encoding EPS15 homology (EH) domain-containing proteins. The protein-binding EH domain was first noted in EPS15, a substrate for the epidermal growth factor receptor. The EH domain has been shown to be an important motif in proteins involved in protein-protein interactions and in intracellular sorting. The protein encoded by this gene is thought to play a role in the endocytosis of IGF1 receptors. Alternatively spliced transcript variants have been found for this gene. | 10938 | EHD1 | EH domain containing 1 | ENSG00000110047 |
| To reach fertilization competence, spermatozoa undergo a series of morphological and molecular maturational processes, termed capacitation, involving protein tyrosine phosphorylation and increased intracellular calcium. The protein encoded by this gene localizes to the principal piece of the sperm flagellum in association with the fibrous sheath and exhibits calcium-binding when phosphorylated during capacitation. A pseudogene on chromosome 3 has been identified for this gene. Alternatively spliced transcript variants encoding distinct protein isoforms have been found for this gene. | 26256 | CABYR | calcium binding tyrosine phosphorylation regulated | ENSG00000154040 |
| Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. Nucleosomes consist of approximately 146 bp of DNA wrapped around a histone octamer composed of pairs of each of the four core histones (H2A, H2B, H3, and H4). The chromatin fiber is further compacted through the interaction of a linker histone, H1, with the DNA between the nucleosomes to form higher order chromatin structures. This gene is located on chromosome 12 and encodes a replication-independent histone that is a variant H2A histone. The protein is divergent at the C-terminus compared to the consensus H2A histone family member. This gene also encodes an antimicrobial peptide with antibacterial and antifungal activity. | 55766 | H2AFJ | H2A histone family member J | ENSG00000246705 |
| The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 | ANKRD1 | ankyrin repeat domain 1 | ENSG00000148677 |
| NA | 147011 | PROCA1 | protein interacting with cyclin A1 | ENSG00000167525 |
| This gene encodes the second human homologue of the bacterial RuvB gene. Bacterial RuvB protein is a DNA helicase essential for homologous recombination and DNA double-strand break repair. Functional analysis showed that this gene product has both ATPase and DNA helicase activities. This gene is physically linked to the CGB/LHB gene cluster on chromosome 19q13.3, and is very close (55 nt) to the LHB gene, in the opposite orientation. | 10856 | RUVBL2 | RuvB like AAA ATPase 2 | ENSG00000183207 |
| This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | 5837 | PYGM | phosphorylase, glycogen, muscle | ENSG00000068976 |
| This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome type IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1281 | COL3A1 | collagen type III alpha 1 chain | ENSG00000168542 |
| The import of proteins into the nucleus is a process that involves at least 2 steps. The first is an energy-independent docking of the protein to the nuclear envelope and the second is an energy-dependent translocation through the nuclear pore complex. Imported proteins require a nuclear localization sequence (NLS) which generally consists of a short region of basic amino acids or 2 such regions spaced about 10 amino acids apart. Proteins involved in the first step of nuclear import have been identified in different systems. These include the Xenopus protein importin and its yeast homolog, SRP1 (a suppressor of certain temperature-sensitive mutations of RNA polymerase I in Saccharomyces cerevisiae), which bind to the NLS. KPNA2 protein interacts with the NLSs of DNA helicase Q1 and SV40 T antigen and may be involved in the nuclear transport of proteins. KPNA2 also may play a role in V(D)J recombination. Alternative splicing results in multiple transcript variants. | 3838 | KPNA2 | karyopherin subunit alpha 2 | ENSG00000182481 |
| This gene was identified as a gene whose expression can be induced by the tumor necrosis factor alpha (TNF) in umbilical vein endothelial cells. The expression of this gene was shown to be induced by retinoic acid in a cell line expressing an oncogenic version of the retinoic acid receptor alpha fusion protein, which suggested that this gene may be a retinoic acid target gene in acute promyelocytic leukemia. | 7127 | TNFAIP2 | TNF alpha induced protein 2 | ENSG00000185215 |
| NA | ENSG00000153363 | LINC00467 | long intergenic non-protein coding RNA 467 | ENSG00000153363 |
| The nuclear pore complex is a massive structure that extends across the nuclear envelope, forming a gateway that regulates the flow of macromolecules between the nucleus and the cytoplasm. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells. This gene is a member of the FG-repeat-containing nucleoporins. The protein encoded by this gene is localized to the cytoplasmic face of the nuclear pore complex where it is required for proper cell cycle progression and nucleocytoplasmic transport. The 3’ portion of this gene forms a fusion gene with the DEK gene on chromosome 6 in a t(6,9) translocation associated with acute myeloid leukemia and myelodysplastic syndrome. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | 8021 | NUP214 | nucleoporin 214 | ENSG00000126883 |
| The protein encoded by this gene mediates transcriptional control by interaction with the Kruppel-associated box repression domain found in many transcription factors. The protein localizes to the nucleus and is thought to associate with specific chromatin regions. The protein is a member of the tripartite motif family. This tripartite motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region. | 10155 | TRIM28 | tripartite motif containing 28 | ENSG00000130726 |
| Fibrosin is a lymphokine secreted by activated lymphocytes that induces fibroblast proliferation (Prakash and Robbins, 1998 [PubMed 9809749]). | 64319 | FBRS | fibrosin | ENSG00000156860 |
| NA | 51155 | HN1 | hematological and neurological expressed 1 | ENSG00000189159 |
| NA | 51458 | RHCG | Rh family C glycoprotein | ENSG00000140519 |
| The nuclear pore complex (NPC) is found on the nuclear envelope and forms a gateway that regulates the flow of proteins and RNAs between the cytoplasm and nucleoplasm. The NPC is comprised of approximately 30 distinct proteins collectively known as nucleoporins. Nucleoporins are pore-complex-specific glycoproteins which often have cytoplasmically oriented O-linked N-acetylglucosamine residues and numerous repeats of the pentapeptide sequence XFXFG. However, the nucleoporin protein encoded by this gene does not contain the typical FG repeat sequences found in most vertebrate nucleoporins. This nucleoporin is thought to form part of the scaffold for the central channel of the nuclear pore. | 23511 | NUP188 | nucleoporin 188 | ENSG00000095319 |
| The paired immunoglobin-like type 2 receptors consist of highly related activating and inhibitory receptors that are involved in the regulation of many aspects of the immune system. The paired immunoglobulin-like receptor genes are located in a tandem head-to-tail orientation on chromosome 7. This gene encodes the activating member of the receptor pair and contains a truncated cytoplasmic tail relative to its inhibitory counterpart (PILRA), that has a long cytoplasmic tail with immunoreceptor tyrosine-based inhibitory (ITIM) motifs. This gene is thought to have arisen from a duplication of the inhibitory PILRA gene and evolved to acquire its activating function. | 29990 | PILRB | paired immunoglobin-like type 2 receptor beta | ENSG00000121716 |
| The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of aminophospholipid-transporting ATPases. The aminophospholipid translocases transport phosphatidylserine and phosphatidylethanolamine from one side of a bilayer to the other. This gene encodes member 3 of phospholipid-transporting ATPase 8B; other members of this protein family are located on chromosomes 1, 15 and 18. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 148229 | ATP8B3 | ATPase phospholipid transporting 8B3 | ENSG00000130270 |
| This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | 8490 | RGS5 | regulator of G-protein signaling 5 | ENSG00000143248 |
| This gene encodes an oncoprotein which is thought to play a role in the phenotypic determination of hemopoietic cells. Translocations between this gene and nucleophosmin have been associated with myelodysplastic syndrome and acute myeloid leukemia. Multiple transcript variants encoding different isoforms have been found for this gene. | 4291 | MLF1 | myeloid leukemia factor 1 | ENSG00000178053 |
| NA | 60509 | AGBL5 | ATP/GTP binding protein-like 5 | ENSG00000084693 |
| NA | 23589 | CARHSP1 | calcium regulated heat stable protein 1 | ENSG00000153048 |
| NA | 140576 | S100A16 | S100 calcium binding protein A16 | ENSG00000188643 |
| Cytosolic and membrane-bound forms of glutathione S-transferase are encoded by two distinct supergene families. At present, eight distinct classes of the soluble cytoplasmic mammalian glutathione S-transferases have been identified: alpha, kappa, mu, omega, pi, sigma, theta and zeta. This gene encodes a glutathione S-transferase that belongs to the mu class. The mu class of enzymes functions in the detoxification of electrophilic compounds, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress, by conjugation with glutathione. The genes encoding the mu class of enzymes are organized in a gene cluster on chromosome 1p13.3 and are known to be highly polymorphic. These genetic variations can change an individual’s susceptibility to carcinogens and toxins as well as affect the toxicity and efficacy of certain drugs. Mutations of this class mu gene have been linked with a slight increase in a number of cancers, likely due to exposure with environmental toxins. Alternative splicing results in multiple transcript variants. | 2947 | GSTM3 | glutathione S-transferase mu 3 (brain) | ENSG00000134202 |
| The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 | FASN | fatty acid synthase | ENSG00000169710 |
| This gene encodes a member of the beta-transducin protein family. Most proteins of the beta-transducin family are involved in regulatory functions. This protein is possibly involved in some intracellular signaling pathway. This gene is deleted in Williams-Beuren syndrome, a developmental disorder caused by deletion of multiple genes at 7q11.23. | 26608 | TBL2 | transducin (beta)-like 2 | ENSG00000106638 |
| NA | 54535 | CCHCR1 | coiled-coil alpha-helical rod protein 1 | ENSG00000204536 |
| This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | 4627 | MYH9 | myosin, heavy chain 9, non-muscle | ENSG00000100345 |
| NA | 200172 | SLFNL1 | schlafen like 1 | ENSG00000171790 |
| This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of the ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbxs class. Multiple transcript variants encoding different isoforms have been found for this gene. | 26261 | FBXO24 | F-box protein 24 | ENSG00000106336 |
| NA | 113177 | IZUMO4 | IZUMO family member 4 | ENSG00000099840 |
| Aminoacyl-tRNA synthetases catalyze the aminoacylation of tRNA by their cognate amino acid. Because of their central role in linking amino acids with nucleotide triplets contained in tRNAs, aminoacyl-tRNA synthetases are thought to be among the first proteins that appeared in evolution. The protein encoded by this gene belongs to class-I aminoacyl-tRNA synthetase family and is located in the class III region of the major histocompatibility complex. | 7407 | VARS | valyl-tRNA synthetase | ENSG00000204394 |
| The p70/p80 autoantigen is a nuclear complex consisting of two subunits with molecular masses of approximately 70 and 80 kDa. The complex functions as a single-stranded DNA-dependent ATP-dependent helicase. The complex may be involved in the repair of nonhomologous DNA ends such as that required for double-strand break repair, transposition, and V(D)J recombination. High levels of autoantibodies to p70 and p80 have been found in some patients with systemic lupus erythematosus. | 2547 | XRCC6 | X-ray repair cross complementing 6 | ENSG00000196419 |
| The protein encoded by this gene can function as a guanine nucleotide exchange factor (GEF) and may play a role in intracellular signaling and cytoskeleton dynamics at the Golgi apparatus. Polymorphisms in the region of this gene have been found to be associated with spinocerebellar ataxia in some study populations. Alternative splicing results in multiple transcript variants. | 25894 | PLEKHG4 | pleckstrin homology and RhoGEF domain containing G4 | ENSG00000196155 |
| SLC6A16 shows structural characteristics of an Na(+)- and Cl(-)-dependent neurotransmitter transporter, including 12 transmembrane (TM) domains, intracellular N and C termini, and large extracellular loops containing multiple N-glycosylation sites. | 28968 | SLC6A16 | solute carrier family 6 member 16 | ENSG00000063127 |
| This gene encodes the beta subunit of the mitochondrial trifunctional protein, which catalyzes the last three steps of mitochondrial beta-oxidation of long chain fatty acids. The mitochondrial membrane-bound heterocomplex is composed of four alpha and four beta subunits, with the beta subunit catalyzing the 3-ketoacyl-CoA thiolase activity. The encoded protein can also bind RNA and decreases the stability of some mRNAs. The genes of the alpha and beta subunits of the mitochondrial trifunctional protein are located adjacent to each other in the human genome in a head-to-head orientation. Mutations in this gene result in trifunctional protein deficiency. Alternatively spliced transcript variants encoding different isoforms have been described. | 3032 | HADHB | hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), beta subunit | ENSG00000138029 |
| This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbls class and, in addition to an F-box, contains at least six highly degenerated leucine-rich repeats. This family member plays a role in epigenetic silencing. It nucleates at CpG islands and specifically demethylates both mono- and di-methylated lysine-36 of histone H3. Alternative splicing results in multiple transcript variants. | 22992 | KDM2A | lysine demethylase 2A | ENSG00000173120 |
| DEAD box proteins, characterized by the conserved motif Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein, which has an ATPase activity and is a component of the survival of motor neurons (SMN) complex. This protein interacts directly with SMN, the spinal muscular atrophy gene product, and may play a catalytic role in the function of the SMN complex on RNPs. | 11218 | DDX20 | DEAD-box helicase 20 | ENSG00000064703 |
| This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. | 10529 | NEBL | nebulette | ENSG00000078114 |
| NA | 101927055 | LOC101927055 | uncharacterized LOC101927055 | ENSG00000237298 |
| NA | 100506866 | TTN-AS1 | TTN antisense RNA 1 | ENSG00000237298 |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 2, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
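The annotation table for a cluster can list the same Ensembl ID more than once when the query returns several hits (ENSG00000237298 appears twice above), and some of the tables carry a `notfound` column for IDs with no match. A minimal sketch of how the ID list could be cleaned before it is written out is given below. The helper `unique_matched_ids` and the toy data frame `hits` are hypothetical and stand in for `as.data.frame(out)`; whether unmatched IDs should be dropped is an assumption, not something the analysis above does.

```r
# Toy stand-in for as.data.frame(out); the real object comes from mygene::queryMany().
hits <- data.frame(
  query    = c("ENSG00000237298", "ENSG00000237298", "ENSG00000155657", "ENSG00000259716"),
  symbol   = c("LOC101927055", "TTN-AS1", "TTN", NA),
  notfound = c(NA, NA, NA, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return each matched Ensembl ID once, in first-seen order.
unique_matched_ids <- function(hits) {
  # Not every table has a notfound column, so treat its absence as "all matched".
  missing <- if ("notfound" %in% names(hits)) {
    !is.na(hits$notfound) & hits$notfound
  } else {
    rep(FALSE, nrow(hits))
  }
  unique(hits$query[!missing])
}

ids <- unique_matched_ids(hits)
print(ids)
```

The resulting vector could then be passed to `write.table()` with `col.names = FALSE, row.names = FALSE, quote = FALSE` to get one ID per line.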
out <- mygene::queryMany(gene_list[3,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | summary | name | symbol | query | notfound |
|---|---|---|---|---|---|
| 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | actin, gamma 2, smooth muscle, enteric | ACTG2 | ENSG00000163017 | NA |
| 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | actin, alpha 2, smooth muscle, aorta | ACTA2 | ENSG00000107796 | NA |
| 7273 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | titin | TTN | ENSG00000155657 | NA |
| 3043 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | hemoglobin subunit beta | HBB | ENSG00000244734 | NA |
| 1291 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | collagen type VI alpha 1 | COL6A1 | ENSG00000142156 | NA |
| 58 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | actin, alpha 1, skeletal muscle | ACTA1 | ENSG00000143632 | NA |
| 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | MYH11 | ENSG00000133392 | NA |
| ENSG00000180139 | NA | ACTA2 antisense RNA 1 | ACTA2-AS1 | ENSG00000180139 | NA |
| 4638 | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | myosin light chain kinase | MYLK | ENSG00000065534 | NA |
| 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | actin, beta | ACTB | ENSG00000075624 | NA |
| 5730 | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | prostaglandin D2 synthase | PTGDS | ENSG00000107317 | NA |
| 4155 | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | myelin basic protein | MBP | ENSG00000197971 | NA |
| 4637 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain that is expressed in smooth muscle and non-muscle tissues. Genomic sequences representing several pseudogenes have been described and two transcript variants encoding different isoforms have been identified for this gene. | myosin light chain 6 | MYL6 | ENSG00000092841 | NA |
| 23336 | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | synemin | SYNM | ENSG00000182253 | NA |
| 87 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | actinin alpha 1 | ACTN1 | ENSG00000072110 | NA |
| 10398 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | myosin light chain 9 | MYL9 | ENSG00000101335 | NA |
| 7052 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. | transglutaminase 2 | TGM2 | ENSG00000198959 | NA |
| NA | NA | NA | NA | ENSG00000259716 | TRUE |
| 84033 | The obscurin gene spans more than 150 kb, contains over 80 exons and encodes a protein of approximately 720 kDa. The encoded protein contains 68 Ig domains, 2 fibronectin domains, 1 calcium/calmodulin-binding domain, 1 RhoGEF domain with an associated PH domain, and 2 serine-threonine kinase domains. This protein belongs to the family of giant sarcomeric signaling proteins that includes titin and nebulin, and may have a role in the organization of myofibrils during assembly and may mediate interactions between the sarcoplasmic reticulum and myofibrils. Alternatively spliced transcript variants encoding different isoforms have been identified. | obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF | OBSCN | ENSG00000154358 | NA |
| 158471 | The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. | prune homolog 2 | PRUNE2 | ENSG00000106772 | NA |
| 3040 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | hemoglobin subunit alpha 2 | HBA2 | ENSG00000188536 | NA |
| 55679 | This gene encodes a member of a small family of focal adhesion proteins which interacts with ILK (integrin-linked kinase), a protein which effects protein-protein interactions with the extracellular matrix. The encoded protein has five LIM domains, each domain forming two zinc fingers, which permit interactions which regulate cell shape and migration. A pseudogene of this gene is located on chromosome 4. Multiple transcript variants encoding different isoforms have been found for this gene. | LIM zinc finger domain containing 2 | LIMS2 | ENSG00000072163 | NA |
| 6876 | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | transgelin | TAGLN | ENSG00000149591 | NA |
| 6319 | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | stearoyl-CoA desaturase | SCD | ENSG00000099194 | NA |
| ENSG00000269936 | NA | NA | RP11-394O4.5 | ENSG00000269936 | NA |
| 11034 | The product of this gene belongs to the actin-binding proteins ADF family. This family of proteins is responsible for enhancing the turnover rate of actin in vivo. This gene encodes the actin depolymerizing protein that severs actin filaments (F-actin) and binds to actin monomers (G-actin). Two transcript variants encoding distinct isoforms have been identified for this gene. | destrin, actin depolymerizing factor | DSTN | ENSG00000125868 | NA |
| 4633 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | myosin light chain 2 | MYL2 | ENSG00000111245 | NA |
| 4162 | NA | melanoma cell adhesion molecule | MCAM | ENSG00000076706 | NA |
| 1465 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | cysteine and glycine rich protein 1 | CSRP1 | ENSG00000159176 | NA |
| 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | glutathione peroxidase 3 | GPX3 | ENSG00000211445 | NA |
| 25959 | NA | KN motif and ankyrin repeat domains 2 | KANK2 | ENSG00000197256 | NA |
| 7168 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | tropomyosin 1 (alpha) | TPM1 | ENSG00000140416 | NA |
| ENSG00000259627 | NA | NA | RP11-244F12.2 | ENSG00000259627 | NA |
| 2670 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | glial fibrillary acidic protein | GFAP | ENSG00000131095 | NA |
| 6279 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | S100 calcium binding protein A8 | S100A8 | ENSG00000143546 | NA |
| 88 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | actinin alpha 2 | ACTN2 | ENSG00000077522 | NA |
| 7134 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | troponin C1, slow skeletal and cardiac type | TNNC1 | ENSG00000114854 | NA |
| 94274 | The protein encoded by this gene belongs to the protein phosphatase 1 (PP1) inhibitor family. This protein is an inhibitor of smooth muscle myosin phosphatase, and has higher inhibitory activity when phosphorylated. Inhibition of myosin phosphatase leads to increased myosin phosphorylation and enhanced smooth muscle contraction. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | protein phosphatase 1 regulatory inhibitor subunit 14A | PPP1R14A | ENSG00000167641 | NA |
| 123 | The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | perilipin 2 | PLIN2 | ENSG00000147872 | NA |
| 6280 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | S100 calcium binding protein A9 | S100A9 | ENSG00000163220 | NA |
| 3911 | This gene encodes one of the vertebrate laminin alpha chains. Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 nonidentical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. The protein encoded by this gene is the alpha-5 subunit of laminin-10 (laminin-511), laminin-11 (laminin-521) and laminin-15 (laminin-523). | laminin subunit alpha 5 | LAMA5 | ENSG00000130702 | NA |
| 682 | The protein encoded by this gene is a plasma membrane protein that is important in spermatogenesis, embryo implantation, neural network formation, and tumor progression. The encoded protein is also a member of the immunoglobulin superfamily. Multiple transcript variants encoding different isoforms have been found for this gene. | basigin (Ok blood group) | BSG | ENSG00000172270 | NA |
| 493 | The protein encoded by this gene belongs to the family of P-type primary ion transport ATPases characterized by the formation of an aspartyl phosphate intermediate during the reaction cycle. These enzymes remove bivalent calcium ions from eukaryotic cells against very large concentration gradients and play a critical role in intracellular calcium homeostasis. The mammalian plasma membrane calcium ATPase isoforms are encoded by at least four separate genes and the diversity of these enzymes is further increased by alternative splicing of transcripts. The expression of different isoforms and splice variants is regulated in a developmental, tissue- and cell type-specific manner, suggesting that these pumps are functionally adapted to the physiological needs of particular cells and tissues. This gene encodes the plasma membrane calcium ATPase isoform 4. Alternatively spliced transcript variants encoding different isoforms have been identified. | ATPase plasma membrane Ca2+ transporting 4 | ATP2B4 | ENSG00000058668 | NA |
| 1277 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 1 | COL1A1 | ENSG00000108821 | NA |
| 4256 | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | matrix Gla protein | MGP | ENSG00000111341 | NA |
| NA | NA | NA | NA | ENSG00000256545 | TRUE |
| 1158 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | creatine kinase, M-type | CKM | ENSG00000104879 | NA |
| 23413 | This gene is a member of the neuronal calcium sensor gene family, which encode calcium-binding proteins expressed predominantly in neurons. The protein encoded by this gene regulates G protein-coupled receptor phosphorylation in a calcium-dependent manner and can substitute for calmodulin. The protein is associated with secretory granules and modulates synaptic transmission and synaptic plasticity. Multiple transcript variants encoding different isoforms have been found for this gene. | neuronal calcium sensor 1 | NCS1 | ENSG00000107130 | NA |
| 2819 | This gene encodes a member of the NAD-dependent glycerol-3-phosphate dehydrogenase family. The encoded protein plays a critical role in carbohydrate and lipid metabolism by catalyzing the reversible conversion of dihydroxyacetone phosphate (DHAP) and reduced nicotinamide adenine dinucleotide (NADH) to glycerol-3-phosphate (G3P) and NAD+. The encoded cytosolic protein and mitochondrial glycerol-3-phosphate dehydrogenase also form a glycerol phosphate shuttle that facilitates the transfer of reducing equivalents from the cytosol to mitochondria. Mutations in this gene are a cause of transient infantile hypertriglyceridemia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | glycerol-3-phosphate dehydrogenase 1 | GPD1 | ENSG00000167588 | NA |
| 7314 | This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | ubiquitin B | UBB | ENSG00000170315 | NA |
| 8490 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | regulator of G-protein signaling 5 | RGS5 | ENSG00000143248 | NA |
| 4240 | This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | milk fat globule-EGF factor 8 protein | MFGE8 | ENSG00000140545 | NA |
| 5837 | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | phosphorylase, glycogen, muscle | PYGM | ENSG00000068976 | NA |
| 23022 | This gene encodes a cytoskeletal protein that is required for organizing the actin cytoskeleton. The protein is a component of actin-containing microfilaments, and it is involved in the control of cell shape, adhesion, and contraction. Polymorphisms in this gene are associated with a susceptibility to pancreatic cancer type 1, and also with a risk for myocardial infarction. Alternative splicing results in multiple transcript variants. | palladin, cytoskeletal associated protein | PALLD | ENSG00000129116 | NA |
| 1809 | NA | dihydropyrimidinase like 3 | DPYSL3 | ENSG00000113657 | NA |
| 3860 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | keratin 13 | KRT13 | ENSG00000171401 | NA |
| 3039 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | hemoglobin subunit alpha 1 | HBA1 | ENSG00000206172 | NA |
| 8557 | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | titin-cap | TCAP | ENSG00000173991 | NA |
| 25802 | The leiomodin 1 protein has a putative membrane-spanning region and 2 types of tandemly repeated blocks. The transcript is expressed in all tissues tested, with the highest levels in thyroid, eye muscle, skeletal muscle, and ovary. Increased expression of leiomodin 1 may be linked to Graves’ disease and thyroid-associated ophthalmopathy. | leiomodin 1 | LMOD1 | ENSG00000163431 | NA |
| 9260 | The protein encoded by this gene is representative of a family of proteins composed of conserved PDZ and LIM domains. LIM domains are proposed to function in protein-protein recognition in a variety of contexts including gene transcription and development and in cytoskeletal interaction. The LIM domains of this protein bind to protein kinases, whereas the PDZ domain binds to actin filaments. The gene product is involved in the assembly of an actin filament-associated complex essential for transmission of ret/ptc2 mitogenic signaling. The biological function is likely to be that of an adapter, with the PDZ domain localizing the LIM-binding proteins to actin filaments of both skeletal muscle and nonmuscle tissues. Alternative splicing of this gene results in multiple transcript variants. | PDZ and LIM domain 7 | PDLIM7 | ENSG00000196923 | NA |
| 5166 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | pyruvate dehydrogenase kinase 4 | PDK4 | ENSG00000004799 | NA |
| 4624 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | myosin, heavy chain 6, cardiac muscle, alpha | MYH6 | ENSG00000197616 | NA |
| 1281 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome type IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type III alpha 1 chain | COL3A1 | ENSG00000168542 | NA |
| 7846 | Microtubules of the eukaryotic cytoskeleton perform essential and diverse functions and are composed of a heterodimer of alpha and beta tubulins. The genes encoding these microtubule constituents belong to the tubulin superfamily, which is composed of six distinct families. Genes from the alpha, beta and gamma tubulin families are found in all eukaryotes. The alpha and beta tubulins represent the major components of microtubules, while gamma tubulin plays a critical role in the nucleation of microtubule assembly. There are multiple alpha and beta tubulin genes, which are highly conserved among species. This gene encodes alpha tubulin and is highly similar to the mouse and rat Tuba1 genes. Northern blotting studies have shown that the gene expression is predominantly found in morphologically differentiated neurologic cells. This gene is one of three alpha-tubulin genes in a cluster on chromosome 12q. Mutations in this gene cause lissencephaly type 3 (LIS3) - a neurological condition characterized by microcephaly, mental retardation, and early-onset epilepsy and caused by defective neuronal migration. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | tubulin alpha 1a | TUBA1A | ENSG00000167552 | NA |
| 7138 | This gene encodes a protein that is a subunit of troponin, which is a regulatory complex located on the thin filament of the sarcomere. This complex regulates striated muscle contraction in response to fluctuations in intracellular calcium concentration. This complex is composed of three subunits: troponin C, which binds calcium, troponin T, which binds tropomyosin, and troponin I, which is an inhibitory subunit. This protein is the slow skeletal troponin T subunit. Mutations in this gene cause nemaline myopathy type 5, also known as Amish nemaline myopathy, a neuromuscular disorder characterized by muscle weakness and rod-shaped, or nemaline, inclusions in skeletal muscle fibers which affects infants, resulting in death due to respiratory insufficiency, usually in the second year. Multiple transcript variants encoding different isoforms have been found for this gene. | troponin T1, slow skeletal type | TNNT1 | ENSG00000105048 | NA |
| 283120 | This gene is located in an imprinted region of chromosome 11 near the insulin-like growth factor 2 (IGF2) gene. This gene is only expressed from the maternally-inherited chromosome, whereas IGF2 is only expressed from the paternally-inherited chromosome. The product of this gene is a long non-coding RNA which functions as a tumor suppressor. Mutations in this gene have been associated with Beckwith-Wiedemann Syndrome and Wilms tumorigenesis. Alternative splicing results in multiple transcript variants. | H19, imprinted maternally expressed transcript (non-protein coding) | H19 | ENSG00000130600 | NA |
| 5997 | Regulator of G protein signaling (RGS) family members are regulatory molecules that act as GTPase activating proteins (GAPs) for G alpha subunits of heterotrimeric G proteins. RGS proteins are able to deactivate G protein subunits of the Gi alpha, Go alpha and Gq alpha subtypes. They drive G proteins into their inactive GDP-bound forms. Regulator of G protein signaling 2 belongs to this family. The protein acts as a mediator of myeloid differentiation and may play a role in leukemogenesis. | regulator of G-protein signaling 2 | RGS2 | ENSG00000116741 | NA |
| 165 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | AE binding protein 1 | AEBP1 | ENSG00000106624 | NA |
| 283131 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | nuclear paraspeckle assembly transcript 1 (non-protein coding) | NEAT1 | ENSG00000245532 | NA |
| 7316 | This gene represents a ubiquitin gene, ubiquitin C. The encoded protein is a polyubiquitin precursor. Conjugation of ubiquitin monomers or polymers can lead to various effects within a cell, depending on the residues to which ubiquitin is conjugated. Ubiquitination has been associated with protein degradation, DNA repair, cell cycle regulation, kinase modification, endocytosis, and regulation of other cell signaling pathways. | ubiquitin C | UBC | ENSG00000150991 | NA |
| 122622 | This gene encodes a member of the adenylosuccinate synthase family of proteins. The encoded muscle-specific enzyme plays a role in the purine nucleotide cycle by catalyzing the first step in the conversion of inosine monophosphate (IMP) to adenosine monophosphate (AMP). Mutations in this gene may cause adolescent onset distal myopathy. Alternative splicing results in multiple transcript variants. | adenylosuccinate synthase like 1 | ADSSL1 | ENSG00000185100 | NA |
| ENSG00000266844 | NA | NA | RP11-862L9.3 | ENSG00000266844 | NA |
| 6525 | This gene encodes a structural protein that is found exclusively in contractile smooth muscle cells. It associates with stress fibers and constitutes part of the cytoskeleton. This gene is localized to chromosome 22q12.3, distal to the TUPLE1 locus and outside the DiGeorge syndrome deletion. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | smoothelin | SMTN | ENSG00000183963 | NA |
| 3320 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | heat shock protein 90kDa alpha family class A member 1 | HSP90AA1 | ENSG00000080824 | NA |
| 761 | Carbonic anhydrase III (CAIII) is a member of a multigene family (at least six separate genes are known) that encodes carbonic anhydrase isozymes. These carbonic anhydrases are a class of metalloenzymes that catalyze the reversible hydration of carbon dioxide and are differentially expressed in a number of cell types. The expression of the CA3 gene is strictly tissue specific and present at high levels in skeletal muscle and much lower levels in cardiac and smooth muscle. A proportion of carriers of Duchenne muscular dystrophy have a higher CA3 level than normal. The gene spans 10.3 kb and contains seven exons and six introns. | carbonic anhydrase 3 | CA3 | ENSG00000164879 | NA |
| 5950 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | retinol binding protein 4 | RBP4 | ENSG00000138207 | NA |
| 10875 | The protein encoded by this gene is a secreted protein that is similar to the beta- and gamma-chains of fibrinogen. The carboxyl-terminus of the encoded protein consists of the fibrinogen-related domains (FRED). The encoded protein forms a tetrameric complex which is stabilized by interchain disulfide bonds. This protein may play a role in physiologic functions at mucosal sites. | fibrinogen like 2 | FGL2 | ENSG00000127951 | NA |
| 388 | NA | ras homolog family member B | RHOB | ENSG00000143878 | NA |
| ENSG00000261054 | NA | NA | RP11-6O2.4 | ENSG00000261054 | NA |
| 844 | This gene encodes the skeletal muscle specific member of the calsequestrin protein family. Calsequestrin functions as a luminal sarcoplasmic reticulum calcium sensor in both cardiac and skeletal muscle cells. This protein, also known as calmitine, functions as a calcium regulator in the mitochondria of skeletal muscle. This protein is absent in patients with Duchenne and Becker types of muscular dystrophy. | calsequestrin 1 | CASQ1 | ENSG00000143318 | NA |
| 4625 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | myosin, heavy chain 7, cardiac muscle, beta | MYH7 | ENSG00000092054 | NA |
| 51177 | NA | pleckstrin homology domain containing O1 | PLEKHO1 | ENSG00000023902 | NA |
| 7450 | This gene encodes a glycoprotein involved in hemostasis. The encoded preproprotein is proteolytically processed following assembly into large multimeric complexes. These complexes function in the adhesion of platelets to sites of vascular injury and the transport of various proteins in the blood. Mutations in this gene result in von Willebrand disease, an inherited bleeding disorder. An unprocessed pseudogene has been found on chromosome 22. | von Willebrand factor | VWF | ENSG00000110799 | NA |
| 4628 | This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | myosin, heavy chain 10, non-muscle | MYH10 | ENSG00000133026 | NA |
| 11030 | This gene encodes a member of the RNA recognition motif family of RNA-binding proteins. The RNA recognition motif is between 80-100 amino acids in length and family members contain one to four copies of the motif. The RNA recognition motif consists of two short stretches of conserved sequence, as well as a few highly conserved hydrophobic residues. The encoded protein has a single, putative RNA recognition motif in its N-terminus. Alternative splicing results in multiple transcript variants encoding different isoforms. | RNA binding protein with multiple splicing | RBPMS | ENSG00000157110 | NA |
| 2879 | This gene encodes a member of the glutathione peroxidase protein family. Glutathione peroxidase catalyzes the reduction of hydrogen peroxide, organic hydroperoxide, and lipid peroxides by reduced glutathione and functions in the protection of cells against oxidative damage. Human plasma glutathione peroxidase has been shown to be a selenium-containing enzyme and the UGA codon is translated into a selenocysteine. The encoded protein has been identified as a moonlighting protein based on its ability to serve dual functions as a peroxidase as well as a structural protein in mature spermatozoa. Through alternative splicing and transcription initiation, rat produces proteins that localize to the nucleus, mitochondrion, and cytoplasm. In humans, alternative transcription initiation and the cleavage sites of the mitochondrial and nuclear transit peptides need to be experimentally verified. Alternative splicing results in multiple transcript variants. | glutathione peroxidase 4 | GPX4 | ENSG00000167468 | NA |
| 3851 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 4 | KRT4 | ENSG00000170477 | NA |
| 58529 | The protein encoded by this gene is primarily expressed in the skeletal muscle, and belongs to the myozenin family. Members of this family function as calcineurin-interacting proteins that help tether calcineurin to the sarcomere of cardiac and skeletal muscle. They play an important role in modulation of calcineurin signaling. | myozenin 1 | MYOZ1 | ENSG00000177791 | NA |
| 5216 | This gene encodes a member of the profilin family of small actin-binding proteins. The encoded protein plays an important role in actin dynamics by regulating actin polymerization in response to extracellular signals. Deletion of this gene is associated with Miller-Dieker syndrome, and the encoded protein may also play a role in Huntington disease. Multiple pseudogenes of this gene are located on chromosome 1. | profilin 1 | PFN1 | ENSG00000108518 | NA |
| 51559 | NA | 5’-nucleotidase domain containing 3 | NT5DC3 | ENSG00000111696 | NA |
| 5662 | This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | pleckstrin and Sec7 domain containing | PSD | ENSG00000059915 | NA |
| 2027 | This gene encodes one of the three enolase isoenzymes found in mammals. This isoenzyme is found in skeletal muscle cells in the adult where it may play a role in muscle development and regeneration. A switch from alpha enolase to beta enolase occurs in muscle tissue during development in rodents. Mutations in this gene have been associated with glycogen storage disease. Alternatively spliced transcript variants encoding different isoforms have been described. | enolase 3 | ENO3 | ENSG00000108515 | NA |
| 2192 | Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | fibulin 1 | FBLN1 | ENSG00000077942 | NA |
| 79026 | NA | AHNAK nucleoprotein | AHNAK | ENSG00000124942 | NA |
| 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | keratin 10 | KRT10 | ENSG00000186395 | NA |
| 111 | This gene encodes a member of the membrane-bound adenylyl cyclase enzymes. Adenylyl cyclases mediate G protein-coupled receptor signaling through the synthesis of the second messenger cAMP. Activity of the encoded protein is stimulated by the Gs alpha subunit of G protein-coupled receptors and is inhibited by protein kinase A, calcium and Gi alpha subunits. Single nucleotide polymorphisms in this gene may be associated with low birth weight and type 2 diabetes. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | adenylate cyclase 5 | ADCY5 | ENSG00000173175 | NA |
| 219 | This protein belongs to the aldehyde dehydrogenases family of proteins. Aldehyde dehydrogenase is the second enzyme of the major oxidative pathway of alcohol metabolism. This gene does not contain introns in the coding sequence. The variation of this locus may affect the development of alcohol-related problems. | aldehyde dehydrogenase 1 family member B1 | ALDH1B1 | ENSG00000137124 | NA |
| 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | fatty acid synthase | FASN | ENSG00000169710 | NA |
| 1917 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 2) is expressed in brain, heart and skeletal muscle, and the other isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas. This gene may be critical in the development of ovarian cancer. | eukaryotic translation elongation factor 1 alpha 2 | EEF1A2 | ENSG00000101210 | NA |
| 5364 | NA | plexin B1 | PLXNB1 | ENSG00000164050 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_",3,".txt"),
            col.names = FALSE, row.names=FALSE, quote=FALSE);
out <- mygene::queryMany(gene_list[4,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | summary | X_id | query | symbol | notfound |
|---|---|---|---|---|---|
| collagen type VI alpha 3 chain | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | 1293 | ENSG00000163359 | COL6A3 | NA |
| collagen type XII alpha 1 chain | This gene encodes the alpha chain of type XII collagen, a member of the FACIT (fibril-associated collagens with interrupted triple helices) collagen family. Type XII collagen is a homotrimer found in association with type I collagen, an association that is thought to modify the interactions between collagen I fibrils and the surrounding matrix. Alternatively spliced transcript variants encoding different isoforms have been identified. | 1303 | ENSG00000111799 | COL12A1 | NA |
| alpha-2-macroglobulin | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | 2 | ENSG00000175899 | A2M | NA |
| fatty acid binding protein 4 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs' roles include fatty acid uptake, transport, and metabolism. | 2167 | ENSG00000170323 | FABP4 | NA |
| perilipin 1 | The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | 5346 | ENSG00000166819 | PLIN1 | NA |
| membrane metallo-endopeptidase | This gene encodes a common acute lymphocytic leukemia antigen that is an important cell surface marker in the diagnosis of human acute lymphocytic leukemia (ALL). This protein is present on leukemic cells of pre-B phenotype, which represent 85% of cases of ALL. This protein is not restricted to leukemic cells, however, and is found on a variety of normal tissues. It is a glycoprotein that is particularly abundant in kidney, where it is present on the brush border of proximal tubules and on glomerular epithelium. The protein is a neutral endopeptidase that cleaves peptides at the amino side of hydrophobic residues and inactivates several peptide hormones including glucagon, enkephalins, substance P, neurotensin, oxytocin, and bradykinin. This gene, which encodes a 100-kD type II transmembrane glycoprotein, exists in a single copy of greater than 45 kb. The 5’ untranslated region of this gene is alternatively spliced, resulting in four separate mRNA transcripts. The coding region is not affected by alternative splicing. | 4311 | ENSG00000196549 | MME | NA |
| fatty acid synthase | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 | ENSG00000169710 | FASN | NA |
| stearoyl-CoA desaturase | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | 6319 | ENSG00000099194 | SCD | NA |
| gremlin 1, DAN family BMP antagonist | This gene encodes a member of the BMP (bone morphogenic protein) antagonist family. Like BMPs, BMP antagonists contain cystine knots and typically form homo- and heterodimers. The CAN (cerberus and dan) subfamily of BMP antagonists, to which this gene belongs, is characterized by a C-terminal cystine knot with an eight-membered ring. The antagonistic effect of the secreted glycosylated protein encoded by this gene is likely due to its direct binding to BMP proteins. As an antagonist of BMP, this gene may play a role in regulating organogenesis, body patterning, and tissue differentiation. In mouse, this protein has been shown to relay the sonic hedgehog (SHH) signal from the polarizing region to the apical ectodermal ridge during limb bud outgrowth. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 26585 | ENSG00000166923 | GREM1 | NA |
| CD248 molecule | NA | 57124 | ENSG00000174807 | CD248 | NA |
| matrix Gla protein | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | 4256 | ENSG00000111341 | MGP | NA |
| thrombospondin 1 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | 7057 | ENSG00000137801 | THBS1 | NA |
| insulin like growth factor binding protein 4 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. | 3487 | ENSG00000141753 | IGFBP4 | NA |
| retinol binding protein 4 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | 5950 | ENSG00000138207 | RBP4 | NA |
| perilipin 4 | Members of the perilipin family, such as PLIN4, coat intracellular lipid storage droplets (Wolins et al., 2003 [PubMed 12840023]). | 729359 | ENSG00000167676 | PLIN4 | NA |
| desmoplakin | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | 1832 | ENSG00000096696 | DSP | NA |
| LDL receptor related protein 1 | This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | 4035 | ENSG00000123384 | LRP1 | NA |
| complement component 3 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | 718 | ENSG00000125730 | C3 | NA |
| HOP homeobox | The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. | 84525 | ENSG00000171476 | HOPX | NA |
| decorin | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 | ENSG00000011465 | DCN | NA |
| glycerol-3-phosphate acyltransferase, mitochondrial | This gene encodes a mitochondrial enzyme which prefers saturated fatty acids as its substrate for the synthesis of glycerolipids. This metabolic pathway’s first step is catalyzed by the encoded enzyme. Two forms for this enzyme exist, one in the mitochondria and one in the endoplasmic reticulum. Two alternatively spliced transcript variants have been described for this gene. | 57678 | ENSG00000119927 | GPAM | NA |
| NA | NA | NA | ENSG00000256545 | NA | TRUE |
| collagen type I alpha 1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1277 | ENSG00000108821 | COL1A1 | NA |
| nuclear receptor subfamily 4 group A member 1 | This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. Expression is induced by phytohemagglutinin in human lymphocytes and by serum stimulation of arrested fibroblasts. The encoded protein acts as a nuclear transcription factor. Translocation of the protein from the nucleus to mitochondria induces apoptosis. Multiple transcript variants encoding different isoforms have been found for this gene. | 3164 | ENSG00000123358 | NR4A1 | NA |
| keratin 14 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | 3861 | ENSG00000186847 | KRT14 | NA |
| acetyl-CoA carboxylase beta | Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | 32 | ENSG00000076555 | ACACB | NA |
| serine peptidase inhibitor, Kunitz type, 2 | This gene encodes a transmembrane protein with two extracellular Kunitz domains that inhibits a variety of serine proteases. The protein inhibits HGF activator which prevents the formation of active hepatocyte growth factor. This gene is a putative tumor suppressor, and mutations in this gene result in congenital sodium diarrhea. Multiple transcript variants encoding different isoforms have been found for this gene. | 10653 | ENSG00000167642 | SPINT2 | NA |
| follistatin like 1 | This gene encodes a protein with similarity to follistatin, an activin-binding protein. It contains an FS module, a follistatin-like sequence containing 10 conserved cysteine residues. This gene product is thought to be an autoantigen associated with rheumatoid arthritis. | 11167 | ENSG00000163430 | FSTL1 | NA |
| eukaryotic translation elongation factor 1 alpha 1 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart and skeletal muscle. This isoform is identified as an autoantigen in 66% of patients with Felty syndrome. This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes. | 1915 | ENSG00000156508 | EEF1A1 | NA |
| clusterin | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | 1191 | ENSG00000120885 | CLU | NA |
| collagen type VI alpha 1 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | 1291 | ENSG00000142156 | COL6A1 | NA |
| complement component 1, s subcomponent | This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | 716 | ENSG00000182326 | C1S | NA |
| NA | NA | NA | ENSG00000117289 | NA | TRUE |
| matrix metallopeptidase 2 | This gene is a member of the matrix metalloproteinase (MMP) gene family, whose members are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or intracellularly by its S-glutathiolation with no requirement for proteolytical removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | 4313 | ENSG00000087245 | MMP2 | NA |
| lipase E, hormone sensitive type | The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | 3991 | ENSG00000079435 | LIPE | NA |
| junction plakoglobin | This gene encodes a major cytoplasmic protein which is the only known constituent common to submembranous plaques of both desmosomes and intermediate junctions. This protein forms distinct complexes with cadherins and desmosomal cadherins and is a member of the catenin family since it contains a distinct repeating amino acid motif called the armadillo repeat. Mutation in this gene has been associated with Naxos disease. Alternative splicing occurs in this gene; however, not all transcripts have been fully described. | 3728 | ENSG00000173801 | JUP | NA |
| keratin 13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | ENSG00000171401 | KRT13 | NA |
| ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. | 488 | ENSG00000174437 | ATP2A2 | NA |
| heat shock protein family B (small) member 7 | NA | 27129 | ENSG00000173641 | HSPB7 | NA |
| carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | 1357 | ENSG00000091704 | CPA1 | NA |
| actin, alpha 2, smooth muscle, aorta | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 59 | ENSG00000107796 | ACTA2 | NA |
| transforming growth factor beta receptor 2 | This gene encodes a member of the Ser/Thr protein kinase family and the TGFB receptor subfamily. The encoded protein is a transmembrane protein that has a protein kinase domain, forms a heterodimeric complex with another receptor protein, and binds TGF-beta. This receptor/ligand complex phosphorylates proteins, which then enter the nucleus and regulate the transcription of a subset of genes related to cell proliferation. Mutations in this gene have been associated with Marfan Syndrome, Loeys-Dietz Aortic Aneurysm Syndrome, and the development of various types of tumors. Alternatively spliced transcript variants encoding different isoforms have been characterized. | 7048 | ENSG00000163513 | TGFBR2 | NA |
| glutamate-ammonia ligase | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | 2752 | ENSG00000135821 | GLUL | NA |
| aspartate beta-hydroxylase | This gene is thought to play an important role in calcium homeostasis. The gene is expressed from two promoters and undergoes extensive alternative splicing. The encoded set of proteins share varying amounts of overlap near their N-termini but have substantial variations in their C-terminal domains resulting in distinct functional properties. The longest isoforms (a and f) include a C-terminal Aspartyl/Asparaginyl beta-hydroxylase domain that hydroxylates aspartic acid or asparagine residues in the epidermal growth factor (EGF)-like domains of some proteins, including protein C, coagulation factors VII, IX, and X, and the complement factors C1R and C1S. Other isoforms differ primarily in the C-terminal sequence and lack the hydroxylase domain, and some have been localized to the endoplasmic and sarcoplasmic reticulum. Some of these isoforms are found in complexes with calsequestrin, triadin, and the ryanodine receptor, and have been shown to regulate calcium release from the sarcoplasmic reticulum. Some isoforms have been implicated in metastasis. | 444 | ENSG00000198363 | ASPH | NA |
| cathepsin K | The protein encoded by this gene is a lysosomal cysteine proteinase involved in bone remodeling and resorption. This protein, which is a member of the peptidase C1 protein family, is predominantly expressed in osteoclasts. However, the encoded protein is also expressed in a significant fraction of human breast cancers, where it could contribute to tumor invasiveness. Mutations in this gene are the cause of pycnodysostosis, an autosomal recessive disease characterized by osteosclerosis and short stature. | 1513 | ENSG00000143387 | CTSK | NA |
| protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | 5644 | ENSG00000204983 | PRSS1 | NA |
| microsomal glutathione S-transferase 1 | The MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism) family consists of six human proteins, two of which are involved in the production of leukotrienes and prostaglandin E, important mediators of inflammation. Other family members, demonstrating glutathione S-transferase and peroxidase activities, are involved in cellular defense against toxic, carcinogenic, and pharmacologically active electrophilic compounds. This gene encodes a protein that catalyzes the conjugation of glutathione to electrophiles and the reduction of lipid hydroperoxides. This protein is localized to the endoplasmic reticulum and outer mitochondrial membrane where it is thought to protect these membranes from oxidative stress. Several transcript variants, some non-protein coding and some protein coding, have been found for this gene. | 4257 | ENSG00000008394 | MGST1 | NA |
| keratin 1 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3848 | ENSG00000167768 | KRT1 | NA |
| insulin like growth factor binding protein 3 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein forms a ternary complex with insulin-like growth factor acid-labile subunit (IGFALS) and either insulin-like growth factor (IGF) I or II. In this form, it circulates in the plasma, prolonging the half-life of IGFs and altering their interaction with cell surface receptors. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 3486 | ENSG00000146674 | IGFBP3 | NA |
| CD36 molecule | The protein encoded by this gene is the fourth major glycoprotein of the platelet surface and serves as a receptor for thrombospondin in platelets and various cell lines. Since thrombospondins are widely distributed proteins involved in a variety of adhesive processes, this protein may have important functions as a cell adhesion molecule. It binds to collagen, thrombospondin, anionic phospholipids and oxidized LDL. It directly mediates cytoadherence of Plasmodium falciparum parasitized erythrocytes and it binds long chain fatty acids and may function in the transport and/or as a regulator of fatty acid transport. Mutations in this gene cause platelet glycoprotein deficiency. Multiple alternatively spliced transcript variants have been found for this gene. | 948 | ENSG00000135218 | CD36 | NA |
| adrenomedullin | The protein encoded by this gene is a preprohormone which is cleaved to form two biologically active peptides, adrenomedullin and proadrenomedullin N-terminal 20 peptide. Adrenomedullin is a 52 aa peptide with several functions, including vasodilation, regulation of hormone secretion, promotion of angiogenesis, and antimicrobial activity. The antimicrobial activity is antibacterial, as the peptide has been shown to kill E. coli and S. aureus at low concentration. | 133 | ENSG00000148926 | ADM | NA |
| keratin 10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | ENSG00000186395 | KRT10 | NA |
| integrin subunit alpha 8 | Integrins are heterodimeric transmembrane receptor proteins that mediate numerous cellular processes including cell adhesion, cytoskeletal rearrangement, and activation of cell signaling pathways. Integrins are composed of alpha and beta subunits. This gene encodes the alpha 8 subunit of the heterodimeric integrin alpha8beta1 protein. The encoded protein is a single-pass type 1 membrane protein that contains multiple FG-GAP repeats. This repeat is predicted to fold into a beta propeller structure. This gene regulates the recruitment of mesenchymal cells into epithelial structures, mediates cell-cell interactions, and regulates neurite outgrowth of sensory and motor neurons. The integrin alpha8beta1 protein thus plays an important role in wound-healing and organogenesis. Mutations in this gene have been associated with renal hypodysplasia/aplasia-1 (RHDA1) and with several animal models of chronic kidney disease. Alternate splicing results in multiple transcript variants encoding distinct isoforms. | 8516 | ENSG00000077943 | ITGA8 | NA |
| phosphoenolpyruvate carboxykinase 1 | This gene is a main control point for the regulation of gluconeogenesis. The cytosolic enzyme encoded by this gene, along with GTP, catalyzes the formation of phosphoenolpyruvate from oxaloacetate, with the release of carbon dioxide and GDP. The expression of this gene can be regulated by insulin, glucocorticoids, glucagon, cAMP, and diet. Defects in this gene are a cause of cytosolic phosphoenolpyruvate carboxykinase deficiency. A mitochondrial isozyme of the encoded protein also has been characterized. | 5105 | ENSG00000124253 | PCK1 | NA |
| interleukin 6 signal transducer | The protein encoded by this gene is a signal transducer shared by many cytokines, including interleukin 6 (IL6), ciliary neurotrophic factor (CNTF), leukemia inhibitory factor (LIF), and oncostatin M (OSM). This protein functions as a part of the cytokine receptor complex. The activation of this protein is dependent upon the binding of cytokines to their receptors. vIL6, a protein related to IL6 and encoded by the Kaposi sarcoma-associated herpesvirus, can bypass the interleukin 6 receptor (IL6R) and directly activate this protein. Knockout studies in mice suggest that this gene plays a critical role in regulating myocyte apoptosis. Alternatively spliced transcript variants have been described. A related pseudogene has been identified on chromosome 17. | 3572 | ENSG00000134352 | IL6ST | NA |
| caldesmon 1 | This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | 800 | ENSG00000122786 | CALD1 | NA |
| glyceraldehyde-3-phosphate dehydrogenase | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | 2597 | ENSG00000111640 | GAPDH | NA |
| surfactant protein B | This gene encodes the pulmonary-associated surfactant protein B (SPB), an amphipathic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. The SPB enhances the rate of spreading and increases the stability of surfactant monolayers in vitro. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 1, also called pulmonary alveolar proteinosis due to surfactant protein B deficiency, and are associated with fatal respiratory distress in the neonatal period. Alternatively spliced transcript variants encoding the same protein have been identified. | 6439 | ENSG00000168878 | SFTPB | NA |
| transforming growth factor beta induced | This gene encodes an RGD-containing protein that binds to type I, II and IV collagens. The RGD motif is found in many extracellular matrix proteins modulating cell adhesion and serves as a ligand recognition sequence for several integrins. This protein plays a role in cell-collagen interactions and may be involved in endochondral bone formation in cartilage. The protein is induced by transforming growth factor-beta and acts to inhibit cell adhesion. Mutations in this gene are associated with multiple types of corneal dystrophy. | 7045 | ENSG00000120708 | TGFBI | NA |
| serum amyloid A1 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | 6288 | ENSG00000173432 | SAA1 | NA |
| pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | 5406 | ENSG00000175535 | PNLIP | NA |
| discoidin domain receptor tyrosine kinase 2 | Receptor tyrosine kinases (RTKs) play a key role in the communication of cells with their microenvironment. These molecules are involved in the regulation of cell growth, differentiation, and metabolism. In several cases the biochemical mechanism by which RTKs transduce signals across the membrane has been shown to be ligand induced receptor oligomerization and subsequent intracellular phosphorylation. This autophosphorylation leads to phosphorylation of cytosolic targets as well as association with other molecules, which are involved in pleiotropic effects of signal transduction. RTKs have a tripartite structure with extracellular, transmembrane, and cytoplasmic regions. This gene encodes a member of a novel subclass of RTKs and contains a distinct extracellular region encompassing a factor VIII-like domain. Alternative splicing in the 5’ UTR results in multiple transcript variants encoding the same protein. | 4921 | ENSG00000162733 | DDR2 | NA |
| eukaryotic translation elongation factor 1 alpha 1 pseudogene 5 | NA | ENSG00000196205 | ENSG00000196205 | EEF1A1P5 | NA |
| transforming growth factor beta receptor 3 | This locus encodes the transforming growth factor (TGF)-beta type III receptor. The encoded receptor is a membrane proteoglycan that often functions as a co-receptor with other TGF-beta receptor superfamily members. Ectodomain shedding produces soluble TGFBR3, which may inhibit TGFB signaling. Decreased expression of this receptor has been observed in various cancers. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | 7049 | ENSG00000069702 | TGFBR3 | NA |
| apolipoprotein D | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | 347 | ENSG00000189058 | APOD | NA |
| major histocompatibility complex, class I, B | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | 3106 | ENSG00000234745 | HLA-B | NA |
| DAB2, clathrin adaptor protein | This gene encodes a mitogen-responsive phosphoprotein. It is expressed in normal ovarian epithelial cells, but is down-regulated or absent from ovarian carcinoma cell lines, suggesting its role as a tumor suppressor. This protein binds to the SH3 domains of GRB2, an adaptor protein that couples tyrosine kinase receptors to SOS (a guanine nucleotide exchange factor for Ras), via its C-terminal proline-rich sequences, and may thus modulate growth factor/Ras pathways by competing with SOS for binding to GRB2. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 1601 | ENSG00000153071 | DAB2 | NA |
| lysyl oxidase like 2 | This gene encodes a member of the lysyl oxidase gene family. The prototypic member of the family is essential to the biogenesis of connective tissue, encoding an extracellular copper-dependent amine oxidase that catalyses the first step in the formation of crosslinks in collagens and elastin. A highly conserved amino acid sequence at the C-terminus end appears to be sufficient for amine oxidase activity, suggesting that each family member may retain this function. The N-terminus is poorly conserved and may impart additional roles in developmental regulation, senescence, tumor suppression, cell growth control, and chemotaxis to each member of the family. | 4017 | ENSG00000134013 | LOXL2 | NA |
| carboxypeptidase B1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | 1360 | ENSG00000153002 | CPB1 | NA |
| EH domain containing 2 | This gene encodes a member of the EH domain-containing protein family. These proteins are characterized by a C-terminal EF-hand domain, a nucleotide-binding consensus site at the N terminus and a bipartite nuclear localization signal. The encoded protein interacts with the actin cytoskeleton through an N-terminal domain and also binds to an EH domain-binding protein through the C-terminal EH domain. This interaction appears to connect clathrin-dependent endocytosis to actin, suggesting that this gene product participates in the endocytic pathway. | 30846 | ENSG00000024422 | EHD2 | NA |
| desmin | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | ENSG00000175084 | DES | NA |
| nicotinamide N-methyltransferase | N-methylation is one method by which drug and other xenobiotic compounds are metabolized by the liver. This gene encodes the protein responsible for this enzymatic activity which uses S-adenosyl methionine as the methyl donor. | 4837 | ENSG00000166741 | NNMT | NA |
| fibrillin 1 | This gene encodes a member of the fibrillin family of proteins. The encoded preproprotein is proteolytically processed to generate two proteins including the extracellular matrix component fibrillin-1 and the protein hormone asprosin. Fibrillin-1 is an extracellular matrix glycoprotein that serves as a structural component of calcium-binding microfibrils. These microfibrils provide force-bearing structural support in elastic and nonelastic connective tissue throughout the body. Asprosin, secreted by white adipose tissue, has been shown to regulate glucose homeostasis. Mutations in this gene are associated with Marfan syndrome and the related MASS phenotype, as well as ectopia lentis syndrome, Weill-Marchesani syndrome, Shprintzen-Goldberg syndrome and neonatal progeroid syndrome. | 2200 | ENSG00000166147 | FBN1 | NA |
| glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | 2813 | ENSG00000169347 | GP2 | NA |
| syndecan 1 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-1 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-1 expression has been detected in several different tumor types. While several transcript variants may exist for this gene, the full-length natures of only two have been described to date. These two represent the major variants of this gene and encode the same protein. | 6382 | ENSG00000115884 | SDC1 | NA |
| complement C1r subcomponent | NA | 715 | ENSG00000159403 | C1R | NA |
| aconitase 1 | The protein encoded by this gene is a bifunctional, cytosolic protein that functions as an essential enzyme in the TCA cycle and interacts with mRNA to control the levels of iron inside cells. When cellular iron levels are high, this protein binds to a 4Fe-4S cluster and functions as an aconitase. Aconitases are iron-sulfur proteins that function to catalyze the conversion of citrate to isocitrate. When cellular iron levels are low, the protein binds to iron-responsive elements (IREs), which are stem-loop structures found in the 5’ UTR of ferritin mRNA, and in the 3’ UTR of transferrin receptor mRNA. When the protein binds to IRE, it results in repression of translation of ferritin mRNA, and inhibition of degradation of the otherwise rapidly degraded transferrin receptor mRNA. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alternative splicing results in multiple transcript variants. | 48 | ENSG00000122729 | ACO1 | NA |
| ectonucleotide pyrophosphatase/phosphodiesterase 2 | The protein encoded by this gene functions as both a phosphodiesterase, which cleaves phosphodiester bonds at the 5’ end of oligonucleotides, and a phospholipase, which catalyzes production of lysophosphatidic acid (LPA) in extracellular fluids. LPA evokes growth factor-like responses including stimulation of cell proliferation and chemotaxis. This gene product stimulates the motility of tumor cells and has angiogenic properties, and its expression is upregulated in several kinds of carcinomas. The gene product is secreted and further processed to make the biologically active form. Several alternatively spliced transcript variants encoding different isoforms have been identified. | 5168 | ENSG00000136960 | ENPP2 | NA |
| sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 1 | This gene encodes the protein core of a seminal plasma proteoglycan containing chondroitin- and heparan-sulfate chains. The protein’s function is unknown, although similarity to thyropin-type cysteine protease-inhibitors suggests its function may be related to protease inhibition. | 6695 | ENSG00000152377 | SPOCK1 | NA |
| immunoglobulin heavy constant gamma 1 (G1m marker) | NA | ENSG00000211896 | ENSG00000211896 | IGHG1 | NA |
| dermokine | This gene is upregulated in inflammatory diseases, and it was first observed as expressed in the differentiated layers of skin. The most interesting aspect of this gene is the differential use of promoters and terminators to generate isoforms with unique cellular distributions and domain components. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | 93099 | ENSG00000161249 | DMKN | NA |
| eukaryotic translation elongation factor 2 | This gene encodes a member of the GTP-binding translation elongation factor family. This protein is an essential factor for protein synthesis. It promotes the GTP-dependent translocation of the nascent protein chain from the A-site to the P-site of the ribosome. This protein is completely inactivated by EF-2 kinase phosphorylation. | 1938 | ENSG00000167658 | EEF2 | NA |
| TIMP metallopeptidase inhibitor 4 | This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. The secreted, netrin domain-containing protein encoded by this gene is involved in regulation of platelet aggregation and recruitment and may play a role in hormonal regulation and endometrial tissue remodeling. | 7079 | ENSG00000157150 | TIMP4 | NA |
| cathepsin B | This gene encodes a member of the C1 family of peptidases. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to form the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. It is also known as amyloid precursor protein secretase and is involved in the proteolytic processing of amyloid precursor protein (APP). Incomplete proteolytic processing of APP has been suggested to be a causative factor in Alzheimer’s disease, the most common cause of dementia. Overexpression of the encoded protein has been associated with esophageal adenocarcinoma and other tumors. Multiple pseudogenes of this gene have been identified. | 1508 | ENSG00000164733 | CTSB | NA |
| protocadherin 18 | This gene belongs to the protocadherin gene family, a subfamily of the cadherin superfamily. This gene encodes a protein which contains 6 extracellular cadherin domains, a transmembrane domain and a cytoplasmic tail differing from those of the classical cadherins. Although its specific function is undetermined, the cadherin-related neuronal receptor is thought to play a role in the establishment and function of specific cell-cell connections in the brain. | 54510 | ENSG00000189184 | PCDH18 | NA |
| chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | 10136 | ENSG00000142789 | CELA3A | NA |
| septin 11 | SEPT11 belongs to the conserved septin family of filament-forming cytoskeletal GTPases that are involved in a variety of cellular functions including cytokinesis and vesicle trafficking (Hanai et al., 2004 [PubMed 15196925]; Nagata et al., 2004 [PubMed 15485874]). | 55752 | ENSG00000138758 | SEPT11 | NA |
| myosin light chain kinase | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | 4638 | ENSG00000065534 | MYLK | NA |
| REV3 like, DNA directed polymerase zeta catalytic subunit | NA | 5980 | ENSG00000009413 | REV3L | NA |
| keratin 4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3851 | ENSG00000170477 | KRT4 | NA |
| ERBB receptor feedback inhibitor 1 | ERRFI1 is a cytoplasmic protein whose expression is upregulated with cell growth (Wick et al., 1995 [PubMed 7641805]). It shares significant homology with the protein product of rat gene-33, which is induced during cell stress and mediates cell signaling (Makkinje et al., 2000 [PubMed 10749885]; Fiorentino et al., 2000 [PubMed 11003669]). | 54206 | ENSG00000116285 | ERRFI1 | NA |
| protamine 2 | Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | 5620 | ENSG00000122304 | PRM2 | NA |
| carboxyl ester lipase | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | 1056 | ENSG00000170835 | CEL | NA |
| coiled-coil domain containing 80 | NA | 151887 | ENSG00000091986 | CCDC80 | NA |
| cell death inducing DFFA like effector c | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | 63924 | ENSG00000187288 | CIDEC | NA |
| 1-acylglycerol-3-phosphate O-acyltransferase 2 | This gene encodes a member of the 1-acylglycerol-3-phosphate O-acyltransferase family. The protein is located within the endoplasmic reticulum membrane and converts lysophosphatidic acid to phosphatidic acid, the second step in de novo phospholipid biosynthesis. Mutations in this gene have been associated with congenital generalized lipodystrophy (CGL), or Berardinelli-Seip syndrome, a disease characterized by a near absence of adipose tissue and severe insulin resistance. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 10555 | ENSG00000169692 | AGPAT2 | NA |
| creatine kinase, M-type | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | 1158 | ENSG00000104879 | CKM | NA |
| acyl-CoA synthetase long-chain family member 1 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | 2180 | ENSG00000151726 | ACSL1 | NA |
| ribosomal protein S2 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S5P family of ribosomal proteins. It is located in the cytoplasm. This gene shares sequence similarity with mouse LLRep3. It is co-transcribed with the small nucleolar RNA gene U64, which is located in its third intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | 6187 | ENSG00000140988 | RPS2 | NA |
| complement factor D | This gene encodes a member of the S1, or chymotrypsin, family of serine peptidases. This protease catalyzes the cleavage of factor B, the rate-limiting step of the alternative pathway of complement activation. This protein also functions as an adipokine, a cell signaling protein secreted by adipocytes, which regulates insulin secretion in mice. Mutations in this gene underlie complement factor D deficiency, which is associated with recurrent bacterial meningitis infections in human patients. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate the mature protease. | 1675 | ENSG00000197766 | CFD | NA |
# Write the queried Ensembl IDs to the cluster 4 file, one ID per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 4, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
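A single query ID can come back from `queryMany` with more than one hit; in the table for the next cluster, ENSG00000198668 is listed for both CALM1 and CALM2. Writing `out$query` as-is therefore repeats that ID in the output file. The sketch below shows one possible way to keep each ID once before writing. It uses a small toy data frame in place of a real `queryMany` result, and `out_toy`, `unique_ids`, and the temporary output file are hypothetical names introduced only for illustration.

```r
# Toy stand-in for a mygene::queryMany() result: the first query ID has two hits.
out_toy <- data.frame(
  query  = c("ENSG00000198668", "ENSG00000198668", "ENSG00000160014"),
  symbol = c("CALM1", "CALM2", "CALM3"),
  stringsAsFactors = FALSE
)

# Keep each queried Ensembl ID once, in order of first appearance.
unique_ids <- unique(out_toy$query)

# Write one ID per line to a temporary file (hypothetical path, not the
# ../utilities/... location used in this analysis).
out_file <- tempfile(fileext = ".txt")
write.table(unique_ids, out_file,
            col.names = FALSE, row.names = FALSE, quote = FALSE)

# Read the file back to inspect what was written.
written_ids <- readLines(out_file)
print(written_ids)
```

Whether duplicates should be dropped depends on how the ID files are used downstream; if a later step expects exactly one line per table row, the original behaviour is the one to keep.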
out <- mygene::queryMany(gene_list[5,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | symbol | summary | query | name |
|---|---|---|---|---|
| 6616 | SNAP25 | Synaptic vesicle membrane docking and fusion is mediated by SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptors) located on the vesicle membrane (v-SNAREs) and the target membrane (t-SNAREs). The assembled v-SNARE/t-SNARE complex consists of a bundle of four helices, one of which is supplied by v-SNARE and the other three by t-SNARE. For t-SNAREs on the plasma membrane, the protein syntaxin supplies one helix and the protein encoded by this gene contributes the other two. Therefore, this gene product is a presynaptic plasma membrane protein involved in the regulation of neurotransmitter release. Two alternative transcript variants encoding different protein isoforms have been described for this gene. | ENSG00000132639 | synaptosome associated protein 25 |
| 3043 | HBB | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | ENSG00000244734 | hemoglobin subunit beta |
| 801 | CALM1 | This gene encodes a member of the EF-hand calcium-binding protein family. It is one of three genes which encode an identical calcium binding protein which is one of the four subunits of phosphorylase kinase. Two pseudogenes have been identified on chromosome 7 and X. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000198668 | calmodulin 1 (phosphorylase kinase, delta) |
| 805 | CALM2 | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000198668 | calmodulin 2 (phosphorylase kinase, delta) |
| 1277 | COL1A1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000108821 | collagen type I alpha 1 |
| 1917 | EEF1A2 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 2) is expressed in brain, heart and skeletal muscle, and the other isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas. This gene may be critical in the development of ovarian cancer. | ENSG00000101210 | eukaryotic translation elongation factor 1 alpha 2 |
| 10382 | TUBB4A | This gene encodes a member of the beta tubulin family. Beta tubulins are one of two core protein families (alpha and beta tubulins) that heterodimerize and assemble to form microtubules. Mutations in this gene cause hypomyelinating leukodystrophy-6 and autosomal dominant torsion dystonia-4. Alternate splicing results in multiple transcript variants encoding different isoforms. A pseudogene of this gene is found on chromosome X. | ENSG00000104833 | tubulin beta 4A class IVa |
| 3798 | KIF5A | This gene encodes a member of the kinesin family of proteins. Members of this family are part of a multisubunit complex that functions as a microtubule motor in intracellular organelle transport. Mutations in this gene cause autosomal dominant spastic paraplegia 10. | ENSG00000155980 | kinesin family member 5A |
| 477 | ATP1A2 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 2 subunit. Mutations in this gene result in familial basilar or hemiplegic migraines, and in a rare syndrome known as alternating hemiplegia of childhood. | ENSG00000018625 | ATPase Na+/K+ transporting subunit alpha 2 |
| 65009 | NDRG4 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that is required for cell cycle progression and survival in primary astrocytes and may be involved in the regulation of mitogenic signalling in vascular smooth muscles cells. Alternative splicing results in multiple transcripts encoding different isoforms. | ENSG00000103034 | NDRG family member 4 |
| 567 | B2M | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | ENSG00000166710 | beta-2-microglobulin |
| 816 | CAMK2B | The product of this gene belongs to the serine/threonine protein kinase family and to the Ca(2+)/calmodulin-dependent protein kinase subfamily. Calcium signaling is crucial for several aspects of plasticity at glutamatergic synapses. In mammalian cells, the enzyme is composed of four different chains: alpha, beta, gamma, and delta. The product of this gene is a beta chain. It is possible that distinct isoforms of this chain have different cellular localizations and interact differently with calmodulin. Alternative splicing results in multiple transcript variants. | ENSG00000058404 | calcium/calmodulin dependent protein kinase II beta |
| 1152 | CKB | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | ENSG00000166165 | creatine kinase B |
| 57030 | SLC17A7 | The protein encoded by this gene is a vesicle-bound, sodium-dependent phosphate transporter that is specifically expressed in the neuron-rich regions of the brain. It is preferentially associated with the membranes of synaptic vesicles and functions in glutamate transport. The protein shares 82% identity with the differentiation-associated Na-dependent inorganic phosphate cotransporter and they appear to form a distinct class within the Na+/Pi cotransporter family. | ENSG00000104888 | solute carrier family 17 member 7 |
| 230 | ALDOC | This gene encodes a member of the class I fructose-biphosphate aldolase gene family. Expressed specifically in the hippocampus and Purkinje cells of the brain, the encoded protein is a glycolytic enzyme that catalyzes the reversible aldol cleavage of fructose-1,6-biphosphate and fructose 1-phosphate to dihydroxyacetone phosphate and either glyceraldehyde-3-phosphate or glyceraldehyde, respectively. | ENSG00000109107 | aldolase, fructose-bisphosphate C |
| 1292 | COL6A2 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | ENSG00000142173 | collagen type VI alpha 2 |
| 7447 | VSNL1 | This gene is a member of the visinin/recoverin subfamily of neuronal calcium sensor proteins. The encoded protein is strongly expressed in granule cells of the cerebellum where it associates with membranes in a calcium-dependent manner and modulates intracellular signaling pathways of the central nervous system by directly or indirectly regulating the activity of adenylyl cyclase. Alternatively spliced transcript variants have been observed, but their full-length nature has not been determined. | ENSG00000163032 | visinin like 1 |
| 1281 | COL3A1 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000168542 | collagen type III alpha 1 chain |
| 3040 | HBA2 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | ENSG00000188536 | hemoglobin subunit alpha 2 |
| 6252 | RTN1 | This gene belongs to the family of reticulon encoding genes. Reticulons are associated with the endoplasmic reticulum, and are involved in neuroendocrine secretion or in membrane trafficking in neuroendocrine cells. This gene is considered to be a specific marker for neurological diseases and cancer, and is a potential molecular target for therapy. Alternative splicing results in multiple transcript variants. | ENSG00000139970 | reticulon 1 |
| 9796 | PHYHIP | NA | ENSG00000168490 | phytanoyl-CoA 2-hydroxylase interacting protein |
| 57447 | NDRG2 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | ENSG00000165795 | NDRG family member 2 |
| 3800 | KIF5C | The protein encoded by this gene is a kinesin heavy chain subunit involved in the transport of cargo within the central nervous system. The encoded protein, which acts as a tetramer by associating with another heavy chain and two light chains, interacts with protein kinase CK2. Mutations in this gene have been associated with complex cortical dysplasia with other brain malformations-2. Two transcript variants, one protein-coding and the other non-protein coding, have been found for this gene. | ENSG00000168280 | kinesin family member 5C |
| 6507 | SLC1A3 | This gene encodes a member of a high affinity glutamate transporter family. This gene functions in the termination of excitatory neurotransmission in the central nervous system. Mutations are associated with episodic ataxia, Type 6. Alternative splicing results in multiple transcript variants. | ENSG00000079215 | solute carrier family 1 member 3 |
| 1114 | CHGB | This gene encodes a tyrosine-sulfated secretory protein abundant in peptidergic endocrine cells and neurons. This protein may serve as a precursor for regulatory peptides. | ENSG00000089199 | chromogranin B |
| 4130 | MAP1A | This gene encodes a protein that belongs to the microtubule-associated protein family. The proteins of this family are thought to be involved in microtubule assembly, which is an essential step in neurogenesis. The product of this gene is a precursor polypeptide that presumably undergoes proteolytic processing to generate the final MAP1A heavy chain and LC2 light chain. Expression of this gene is almost exclusively in the brain. Studies of the rat microtubule-associated protein 1A gene suggested a role in early events of spinal cord development. | ENSG00000166963 | microtubule associated protein 1A |
| 2670 | GFAP | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | ENSG00000131095 | glial fibrillary acidic protein |
| 6812 | STXBP1 | This gene encodes a syntaxin-binding protein. The encoded protein appears to play a role in release of neurotransmitters via regulation of syntaxin, a transmembrane attachment protein receptor. Mutations in this gene have been associated with infantile epileptic encephalopathy-4. Alternatively spliced transcript variants have been described. | ENSG00000136854 | syntaxin binding protein 1 |
| 1759 | DNM1 | This gene encodes a member of the dynamin subfamily of GTP-binding proteins. The encoded protein possesses unique mechanochemical properties used to tubulate and sever membranes, and is involved in clathrin-mediated endocytosis and other vesicular trafficking processes. Actin and other cytoskeletal proteins act as binding partners for the encoded protein, which can also self-assemble leading to stimulation of GTPase activity. More than sixty highly conserved copies of the 3’ region of this gene are found elsewhere in the genome, particularly on chromosomes Y and 15. Alternatively spliced transcript variants encoding different isoforms have been described. | ENSG00000106976 | dynamin 1 |
| 146330 | FBXL16 | Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | ENSG00000127585 | F-box and leucine rich repeat protein 16 |
| 287 | ANK2 | This gene encodes a member of the ankyrin family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton. Ankyrins play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. The protein encoded by this gene is required for targeting and stability of Na/Ca exchanger 1 in cardiomyocytes. Mutations in this gene cause long QT syndrome 4 and cardiac arrhythmia syndrome. Multiple transcript variants encoding different isoforms have been described. | ENSG00000145362 | ankyrin 2, neuronal |
| 11075 | STMN2 | This gene encodes a member of the stathmin family of phosphoproteins. Stathmin proteins function in microtubule dynamics and signal transduction. The encoded protein plays a regulatory role in neuronal growth and is also thought to be involved in osteogenesis. Reductions in the expression of this gene have been associated with Down’s syndrome and Alzheimer’s disease. Alternatively spliced transcript variants have been observed for this gene. A pseudogene of this gene is located on the long arm of chromosome 6. | ENSG00000104435 | stathmin 2 |
| 11170 | FAM107A | NA | ENSG00000168309 | family with sequence similarity 107 member A |
| 9145 | SYNGR1 | This gene encodes an integral membrane protein associated with presynaptic vesicles in neuronal cells. The exact function of this protein is unclear, but studies of a similar murine protein suggest that it functions in synaptic plasticity without being required for synaptic transmission. The gene product belongs to the synaptogyrin gene family. Three alternatively spliced variants encoding three different isoforms have been identified. | ENSG00000100321 | synaptogyrin 1 |
| 8497 | PPFIA4 | PPFIA4, or liprin-alpha-4, belongs to the liprin-alpha gene family. See liprin-alpha-1 (LIP1, or PPFIA1; MIM 611054) for background on liprins. | ENSG00000143847 | PTPRF interacting protein alpha 4 |
| 4627 | MYH9 | This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | ENSG00000100345 | myosin, heavy chain 9, non-muscle |
| 6620 | SNCB | This gene encodes a member of a small family of proteins that inhibit phospholipase D2 and may function in neuronal plasticity. The encoded protein is abundant in lesions of patients with Alzheimer disease. A mutation in this gene was found in individuals with dementia with Lewy bodies. Alternative splicing results in multiple transcript variants. | ENSG00000074317 | synuclein beta |
| 808 | CALM3 | NA | ENSG00000160014 | calmodulin 3 (phosphorylase kinase, delta) |
| 805 | CALM2 | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000160014 | calmodulin 2 (phosphorylase kinase, delta) |
| 25789 | TMEM59L | This gene encodes a predicted type-I membrane glycoprotein. The encoded protein may play a role in functioning of the central nervous system. | ENSG00000105696 | transmembrane protein 59 like |
| 9764 | KIAA0513 | NA | ENSG00000135709 | KIAA0513 |
| 112755 | STX1B | The protein encoded by this gene belongs to a family of proteins thought to play a role in the exocytosis of synaptic vesicles. Vesicle exocytosis releases vesicular contents and is important to various cellular functions. For instance, the secretion of transmitters from neurons plays an important role in synaptic transmission. After exocytosis, the membrane and proteins from the vesicle are retrieved from the plasma membrane through the process of endocytosis. Mutations in this gene have been identified as one cause of fever-associated epilepsy syndromes. A possible link between this gene and Parkinson’s disease has also been suggested. | ENSG00000099365 | syntaxin 1B |
| 770 | CA11 | Carbonic anhydrases (CAs) are a large family of zinc metalloenzymes that catalyze the reversible hydration of carbon dioxide. They participate in a variety of biological processes, including respiration, calcification, acid-base balance, bone resorption, and the formation of aqueous humor, cerebrospinal fluid, saliva, and gastric acid. They show extensive diversity in tissue distribution and in their subcellular localization. CA XI is likely a secreted protein, however, radical changes at active site residues completely conserved in CA isozymes with catalytic activity, make it unlikely that it has carbonic anhydrase activity. It shares properties in common with two other acatalytic CA isoforms, CA VIII and CA X. CA XI is most abundantly expressed in brain, and may play a general role in the central nervous system. | ENSG00000063180 | carbonic anhydrase 11 |
| 59 | ACTA2 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000107796 | actin, alpha 2, smooth muscle, aorta |
| 7057 | THBS1 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | ENSG00000137801 | thrombospondin 1 |
| 1363 | CPE | This gene encodes a member of the M14 family of metallocarboxypeptidases. The encoded preproprotein is proteolytically processed to generate the mature peptidase. This peripheral membrane protein cleaves C-terminal amino acid residues and is involved in the biosynthesis of peptide hormones and neurotransmitters, including insulin. This protein may also function independently of its peptidase activity, as a neurotrophic factor that promotes neuronal survival, and as a sorting receptor that binds to regulated secretory pathway proteins, including prohormones. Mutations in this gene are implicated in type 2 diabetes. | ENSG00000109472 | carboxypeptidase E |
| 9762 | LZTS3 | NA | ENSG00000088899 | leucine zipper, putative tumor suppressor family member 3 |
| 192683 | SCAMP5 | NA | ENSG00000198794 | secretory carrier membrane protein 5 |
| 482 | ATP1B2 | The protein encoded by this gene belongs to the family of Na+/K+ and H+/K+ ATPases beta chain proteins, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The beta subunit regulates, through assembly of alpha/beta heterodimers, the number of sodium pumps transported to the plasma membrane. The glycoprotein subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes a beta 2 subunit. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000129244 | ATPase Na+/K+ transporting subunit beta 2 |
| 3133 | HLA-E | HLA-E belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-E binds a restricted subset of peptides derived from the leader peptides of other class I molecules. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. | ENSG00000204592 | major histocompatibility complex, class I, E |
| 2026 | ENO2 | This gene encodes one of the three enolase isoenzymes found in mammals. This isoenzyme, a homodimer, is found in mature neurons and cells of neuronal origin. A switch from alpha enolase to gamma enolase occurs in neural tissue during development in rats and primates. | ENSG00000111674 | enolase 2 |
| 23542 | MAPK8IP2 | The protein encoded by this gene is closely related to MAPK8IP1/IB1/JIP-1, a scaffold protein that is involved in the c-Jun amino-terminal kinase signaling pathway. This protein is expressed in brain and pancreatic cells. It has been shown to interact with, and regulate the activity of MAPK8/JNK1, and MAP2K7/MKK7 kinases. This protein thus is thought to function as a regulator of signal transduction by protein kinase cascade in brain and pancreatic beta-cells. | ENSG00000008735 | mitogen-activated protein kinase 8 interacting protein 2 |
| 1634 | DCN | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | ENSG00000011465 | decorin |
| 4313 | MMP2 | This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or intracellularly by its S-glutathiolation with no requirement for proteolytic removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000087245 | matrix metallopeptidase 2 |
| 302 | ANXA2 | This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions as an autocrine factor which heightens osteoclast formation and bone resorption. This gene has three pseudogenes located on chromosomes 4, 9 and 10, respectively. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000182718 | annexin A2 |
| 2192 | FBLN1 | Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | ENSG00000077942 | fibulin 1 |
| 972 | CD74 | The protein encoded by this gene associates with class II major histocompatibility complex (MHC) and is an important chaperone that regulates antigen presentation for immune response. It also serves as cell surface receptor for the cytokine macrophage migration inhibitory factor (MIF) which, when bound to the encoded protein, initiates survival pathways and cell proliferation. This protein also interacts with amyloid precursor protein (APP) and suppresses the production of amyloid beta (Abeta). Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000019582 | CD74 molecule |
| 23095 | KIF1B | This gene encodes a motor protein that transports mitochondria and synaptic vesicle precursors. Mutations in this gene cause Charcot-Marie-Tooth disease, type 2A1. | ENSG00000054523 | kinesin family member 1B |
| 9379 | NRXN2 | This gene encodes a member of the neurexin gene family. The products of these genes function as cell adhesion molecules and receptors in the vertebrate nervous system. These genes utilize two promoters. The majority of transcripts are produced from the upstream promoter and encode alpha-neurexin isoforms while a smaller number of transcripts are produced from the downstream promoter and encode beta-neurexin isoforms. The alpha-neurexins contain epidermal growth factor-like (EGF-like) sequences and laminin G domains, and have been shown to interact with neurexophilins. The beta-neurexins lack EGF-like sequences and contain fewer laminin G domains than alpha-neurexins. Alternative splicing and the use of alternative promoters may generate thousands of transcript variants (PMID: 12036300, PMID: 11944992). | ENSG00000110076 | neurexin 2 |
| 23413 | NCS1 | This gene is a member of the neuronal calcium sensor gene family, which encode calcium-binding proteins expressed predominantly in neurons. The protein encoded by this gene regulates G protein-coupled receptor phosphorylation in a calcium-dependent manner and can substitute for calmodulin. The protein is associated with secretory granules and modulates synaptic transmission and synaptic plasticity. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000107130 | neuronal calcium sensor 1 |
| 10472 | ZBTB18 | This gene encodes a C2H2-type zinc finger protein which acts as a transcriptional repressor of genes involved in neuronal development. The encoded protein recognizes a specific sequence motif and recruits components of chromatin to target genes. Alternative splicing results in multiple transcript variants. | ENSG00000179456 | zinc finger and BTB domain containing 18 |
| 79026 | AHNAK | NA | ENSG00000124942 | AHNAK nucleoprotein |
| 23362 | PSD3 | NA | ENSG00000156011 | pleckstrin and Sec7 domain containing 3 |
| 5310 | PKD1 | This gene encodes a member of the polycystin protein family. The encoded glycoprotein contains a large N-terminal extracellular region, multiple transmembrane domains and a cytoplasmic C-tail. It is an integral membrane protein that functions as a regulator of calcium permeable cation channels and intracellular calcium homoeostasis. It is also involved in cell-cell/matrix interactions and may modulate G-protein-coupled signal-transduction pathways. It plays a role in renal tubular development, and mutations in this gene cause autosomal dominant polycystic kidney disease type 1 (ADPKD1). ADPKD1 is characterized by the growth of fluid-filled cysts that replace normal renal tissue and result in end-stage renal failure. Splice variants encoding different isoforms have been noted for this gene. Also, six pseudogenes, closely linked in a known duplicated region on chromosome 16p, have been described. | ENSG00000008710 | polycystin 1, transient receptor potential channel interacting |
| 7431 | VIM | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | ENSG00000026025 | vimentin |
| 165 | AEBP1 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | ENSG00000106624 | AE binding protein 1 |
| 155066 | ATP6V0E2 | Multisubunit vacuolar-type proton pumps, or H(+)-ATPases, acidify various intracellular compartments, such as vacuoles, clathrin-coated and synaptic vesicles, endosomes, lysosomes, and chromaffin granules. H(+)-ATPases are also found in plasma membranes of specialized cells, where they play roles in urinary acidification, bone resorption, and sperm maturation. Multiple subunits form H(+)-ATPases, with proteins of the V1 class hydrolyzing ATP for energy to transport H+, and proteins of the V0 class forming an integral membrane domain through which H+ is transported. ATP6V0E2 encodes an isoform of the H(+)-ATPase V0 e subunit, an essential proton pump component (Blake-Palmer et al., 2007 [PubMed 17350184]). | ENSG00000171130 | ATPase H+ transporting V0 subunit e2 |
| 51286 | CEND1 | The protein encoded by this gene is a neuron-specific protein. The similar protein in pig enhances neuroblastoma cell differentiation in vitro and may be involved in neuronal differentiation in vivo. Multiple pseudogenes have been reported for this gene. | ENSG00000184524 | cell cycle exit and neuronal differentiation 1 |
| 7532 | YWHAG | This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 100% identical to the rat ortholog. It is induced by growth factors in human vascular smooth muscle cells, and is also highly expressed in skeletal and heart muscles, suggesting an important role for this protein in muscle tissue. It has been shown to interact with RAF1 and protein kinase C, proteins involved in various signal transduction pathways. | ENSG00000170027 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein gamma |
| 10313 | RTN3 | This gene belongs to the reticulon family of highly conserved genes that are preferentially expressed in neuroendocrine tissues. This family of proteins interact with, and modulate the activity of beta-amyloid converting enzyme 1 (BACE1), and the production of amyloid-beta. An increase in the expression of any reticulon protein substantially reduces the production of amyloid-beta, suggesting that reticulon proteins are negative modulators of BACE1 in cells. Alternatively spliced transcript variants encoding different isoforms have been found for this gene, and pseudogenes of this gene are located on chromosomes 4 and 12. | ENSG00000133318 | reticulon 3 |
| 63908 | NAPB | NA | ENSG00000125814 | NSF attachment protein beta |
| 4131 | MAP1B | This gene encodes a protein that belongs to the microtubule-associated protein family. The proteins of this family are thought to be involved in microtubule assembly, which is an essential step in neurogenesis. The product of this gene is a precursor polypeptide that presumably undergoes proteolytic processing to generate the final MAP1B heavy chain and LC1 light chain. Gene knockout studies of the mouse microtubule-associated protein 1B gene suggested an important role in development and function of the nervous system. | ENSG00000131711 | microtubule associated protein 1B |
| 114088 | TRIM9 | The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region. The protein localizes to cytoplasmic bodies. Its function has not been identified. Alternate splicing of this gene generates two transcript variants encoding different isoforms. | ENSG00000100505 | tripartite motif containing 9 |
| 1742 | DLG4 | This gene encodes a member of the membrane-associated guanylate kinase (MAGUK) family. It heteromultimerizes with another MAGUK protein, DLG2, and is recruited into NMDA receptor and potassium channel clusters. These two MAGUK proteins may interact at postsynaptic sites to form a multimeric scaffold for the clustering of receptors, ion channels, and associated signaling proteins. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000132535 | discs large MAGUK scaffold protein 4 |
| 8522 | GAS7 | Growth arrest-specific 7 is expressed primarily in terminally differentiated brain cells and predominantly in mature cerebellar Purkinje neurons. GAS7 plays a putative role in neuronal development. Several transcript variants encoding proteins which vary in the N-terminus have been described. | ENSG00000007237 | growth arrest specific 7 |
| ENSG00000247556 | OIP5-AS1 | NA | ENSG00000247556 | OIP5 antisense RNA 1 |
| 7345 | UCHL1 | The protein encoded by this gene belongs to the peptidase C12 family. This enzyme is a thiol protease that hydrolyzes a peptide bond at the C-terminal glycine of ubiquitin. This gene is specifically expressed in the neurons and in cells of the diffuse neuroendocrine system. Mutations in this gene may be associated with Parkinson disease. | ENSG00000154277 | ubiquitin C-terminal hydrolase L1 |
| 9806 | SPOCK2 | This gene encodes a protein which binds with glycosaminoglycans to form part of the extracellular matrix. The protein contains thyroglobulin type-1, follistatin-like, and calcium-binding domains, and has glycosaminoglycan attachment sites in the acidic C-terminal region. Three alternatively spliced transcript variants that encode different protein isoforms have been described for this gene. | ENSG00000107742 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 2 |
| 1293 | COL6A3 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | ENSG00000163359 | collagen type VI alpha 3 chain |
| 3039 | HBA1 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | ENSG00000206172 | hemoglobin subunit alpha 1 |
| 166 | AES | The protein encoded by this gene is similar in sequence to the amino terminus of Drosophila enhancer of split groucho, a protein involved in neurogenesis during embryonic development. The encoded protein, which belongs to the groucho/TLE family of proteins, can function as a homooligomer or as a heterooligomer with other family members to dominantly repress the expression of other family member genes. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000104964 | amino-terminal enhancer of split |
| 1915 | EEF1A1 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart and skeletal muscle. This isoform is identified as an autoantigen in 66% of patients with Felty syndrome. This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes. | ENSG00000156508 | eukaryotic translation elongation factor 1 alpha 1 |
| 57476 | GRAMD1B | NA | ENSG00000023171 | GRAM domain containing 1B |
| 4185 | ADAM11 | This gene encodes a member of the ADAM (a disintegrin and metalloprotease) protein family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. The encoded preproprotein is proteolytically processed to generate the mature protease. This gene represents a candidate tumor suppressor gene for human breast cancer based on its location within a minimal region of chromosome 17q21 previously defined by tumor deletion mapping. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000073670 | ADAM metallopeptidase domain 11 |
| 4504 | MT3 | NA | ENSG00000087250 | metallothionein 3 |
| 23467 | NPTXR | This gene encodes a protein similar to the rat neuronal pentraxin receptor. The rat pentraxin receptor is an integral membrane protein that is thought to mediate neuronal uptake of the snake venom toxin, taipoxin, and its transport into the synapses. Studies in rat indicate that translation of this mRNA initiates at a non-AUG (CUG) codon. This may also be true for mouse and human, based on strong sequence conservation amongst these species. | ENSG00000221890 | neuronal pentraxin receptor |
| 51310 | SLC22A17 | NA | ENSG00000092096 | solute carrier family 22 member 17 |
| 4256 | MGP | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000111341 | matrix Gla protein |
| 25999 | CLIP3 | This gene encodes a member of the cytoplasmic linker protein 170 family. Members of this protein family contain a cytoskeleton-associated protein glycine-rich domain and mediate the interaction of microtubules with cellular organelles. The encoded protein plays a role in T cell apoptosis by facilitating the association of tubulin and the lipid raft ganglioside GD3. The encoded protein also functions as a scaffold protein mediating membrane localization of phosphorylated protein kinase B. Alternatively spliced transcript variants have been observed for this gene. | ENSG00000105270 | CAP-Gly domain containing linker protein 3 |
| 9900 | SV2A | NA | ENSG00000159164 | synaptic vesicle glycoprotein 2A |
| ENSG00000225630 | MTND2P28 | NA | ENSG00000225630 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 |
| 6324 | SCN1B | Voltage-gated sodium channels are heteromeric proteins that function in the generation and propagation of action potentials in muscle and neuronal cells. They are composed of one alpha and two beta subunits, where the alpha subunit provides channel activity and the beta-1 subunit modulates the kinetics of channel inactivation. This gene encodes a sodium channel beta-1 subunit. Mutations in this gene result in generalized epilepsy with febrile seizures plus, Brugada syndrome 5, and defects in cardiac conduction. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000105711 | sodium voltage-gated channel beta subunit 1 |
| 7045 | TGFBI | This gene encodes an RGD-containing protein that binds to type I, II and IV collagens. The RGD motif is found in many extracellular matrix proteins modulating cell adhesion and serves as a ligand recognition sequence for several integrins. This protein plays a role in cell-collagen interactions and may be involved in endochondral bone formation in cartilage. The protein is induced by transforming growth factor-beta and acts to inhibit cell adhesion. Mutations in this gene are associated with multiple types of corneal dystrophy. | ENSG00000120708 | transforming growth factor beta induced |
| 5037 | PEBP1 | This gene encodes a member of the phosphatidylethanolamine-binding family of proteins and has been shown to modulate multiple signaling pathways, including the MAP kinase (MAPK), NF-kappa B, and glycogen synthase kinase-3 (GSK-3) signaling pathways. The encoded protein can be further processed to form a smaller cleavage product, hippocampal cholinergic neurostimulating peptide (HCNP), which may be involved in neural development. This gene has been implicated in numerous human cancers and may act as a metastasis suppressor gene. Multiple pseudogenes of this gene have been identified in the genome. | ENSG00000089220 | phosphatidylethanolamine binding protein 1 |
| 9783 | RIMS3 | NA | ENSG00000117016 | regulating synaptic membrane exocytosis 3 |
| 50861 | STMN3 | This gene encodes a protein which is a member of the stathmin protein family. Members of this protein family form a complex with tubulins at a ratio of 2 tubulins for each stathmin protein. Microtubules require the ordered assembly of alpha- and beta-tubulins, and formation of a complex with stathmin disrupts microtubule formation and function. A pseudogene of this gene is located on chromosome 22. Alternative splicing results in multiple transcript variants. | ENSG00000197457 | stathmin 3 |
| 57731 | SPTBN4 | Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein localizes to the nuclear matrix, PML nuclear bodies, and cytoplasmic vesicles. A highly similar gene in the mouse is required for localization of specific membrane proteins in polarized regions of neurons. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000160460 | spectrin beta, non-erythrocytic 4 |
| ENSG00000237973 | MTCO1P12 | NA | ENSG00000237973 | MT-CO1 pseudogene 12 |
| 60 | ACTB | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ENSG00000075624 | actin, beta |
| 3487 | IGFBP4 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. | ENSG00000141753 | insulin like growth factor binding protein 4 |
| 116986 | AGAP2 | The protein encoded by this gene belongs to the centaurin gamma-like family. It mediates anti-apoptotic effects of nerve growth factor by activating nuclear phosphoinositide 3-kinase. It is overexpressed in cancer cells, and promotes cancer cell invasion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | ENSG00000135439 | ArfGAP with GTPase domain, ankyrin repeat and PH domain 2 |
| 8425 | LTBP4 | The protein encoded by this gene binds transforming growth factor beta (TGFB) as it is secreted and targeted to the extracellular matrix. TGFB is biologically latent after secretion and insertion into the extracellular matrix, and sheds TGFB and other proteins upon activation. Defects in this gene may be a cause of cutis laxa and severe pulmonary, gastrointestinal, and urinary abnormalities. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000090006 | latent transforming growth factor beta binding protein 4 |
# Save the Ensembl IDs queried for cluster 5, one ID per line.
write.table(out$query, paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 5, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
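The save step repeats for every cluster, with only the cluster index changing. A minimal sketch of how it could be wrapped in a helper follows. The function name `save_cluster_genes` and its `out_dir` argument are hypothetical and not part of the original analysis, and `writeLines` stands in for `write.table`, since both produce one unquoted ID per line here.

```r
# Hypothetical helper (not part of the original analysis): writes the gene IDs
# of one cluster to "<out_dir>/gene_names_clus_<cluster>.txt", one ID per line.
save_cluster_genes <- function(ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  # writeLines emits plain text: no header, no row names, no quotes
  writeLines(as.character(ids), path)
  invisible(path)
}

# Example with two Ensembl IDs from the cluster 5 table, written to a temporary directory.
demo_path <- save_cluster_genes(c("ENSG00000129244", "ENSG00000204592"),
                                cluster = 5, out_dir = tempdir())
readLines(demo_path)
```

In this analysis the helper could be called as `save_cluster_genes(out$query, 5, "../utilities/GTEX2013_sparse_load_sqrt")` after each `queryMany` call, which keeps the file naming in one place.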
# Annotate the top genes of cluster 6 (row 6 of gene_list) via mygene.
out <- mygene::queryMany(gene_list[6, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | name | summary | symbol | query |
|---|---|---|---|---|
| 3043 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | HBB | ENSG00000244734 |
| 3039 | hemoglobin subunit alpha 1 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | HBA1 | ENSG00000206172 |
| 1277 | collagen type I alpha 1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | COL1A1 | ENSG00000108821 |
| 6711 | spectrin beta, non-erythrocytic 1 | Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein contains an N-terminal actin-binding domain, and 17 spectrin repeats which are involved in dimer formation. Multiple transcript variants encoding different isoforms have been found for this gene. | SPTBN1 | ENSG00000115306 |
| 6280 | S100 calcium binding protein A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | S100A9 | ENSG00000163220 |
| 3848 | keratin 1 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT1 | ENSG00000167768 |
| 8404 | SPARC like 1 | NA | SPARCL1 | ENSG00000152583 |
| 2335 | fibronectin 1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | FN1 | ENSG00000115414 |
| 3040 | hemoglobin subunit alpha 2 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | HBA2 | ENSG00000188536 |
| 6678 | secreted protein acidic and cysteine rich | This gene encodes a cysteine-rich acidic matrix-associated protein. The encoded protein is required for the collagen in bone to become calcified but is also involved in extracellular matrix synthesis and promotion of changes to cell shape. The gene product has been associated with tumor suppression but has also been correlated with metastasis based on changes to cell shape which can promote tumor cell invasion. Three transcript variants encoding different isoforms have been found for this gene. | SPARC | ENSG00000113140 |
| 975 | CD81 molecule | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein that is known to complex with integrins. This protein appears to promote muscle cell fusion and support myotube maintenance. Also it may be involved in signal transduction. This gene is localized in the tumor-suppressor gene region and thus it is a candidate gene for malignancies. Two transcript variants encoding different isoforms have been found for this gene. | CD81 | ENSG00000110651 |
| 4155 | myelin basic protein | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | MBP | ENSG00000197971 |
| 11167 | follistatin like 1 | This gene encodes a protein with similarity to follistatin, an activin-binding protein. It contains an FS module, a follistatin-like sequence containing 10 conserved cysteine residues. This gene product is thought to be an autoantigen associated with rheumatoid arthritis. | FSTL1 | ENSG00000163430 |
| 2034 | endothelial PAS domain protein 1 | This gene encodes a transcription factor involved in the induction of genes regulated by oxygen, which is induced as oxygen levels fall. The encoded protein contains a basic-helix-loop-helix domain protein dimerization domain as well as a domain found in proteins in signal transduction pathways which respond to oxygen levels. Mutations in this gene are associated with erythrocytosis familial type 4. | EPAS1 | ENSG00000116016 |
| 1278 | collagen type I alpha 2 chain | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | COL1A2 | ENSG00000164692 |
| 4035 | LDL receptor related protein 1 | This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | LRP1 | ENSG00000123384 |
| 2 | alpha-2-macroglobulin | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | A2M | ENSG00000175899 |
| 1490 | connective tissue growth factor | The protein encoded by this gene is a mitogen that is secreted by vascular endothelial cells. The encoded protein plays a role in chondrocyte proliferation and differentiation, cell adhesion in many cell types, and is related to platelet-derived growth factor. Certain polymorphisms in this gene have been linked with a higher incidence of systemic sclerosis. | CTGF | ENSG00000118523 |
| 3858 | keratin 10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | KRT10 | ENSG00000186395 |
| 51629 | solute carrier family 25 member 39 | This gene encodes a member of the SLC25 transporter or mitochondrial carrier family of proteins. Members of this family are encoded by the nuclear genome while their protein products are usually embedded in the inner mitochondrial membrane and exhibit wide-ranging substrate specificity. Although the encoded protein is currently considered an orphan transporter, this protein is related to other carriers known to transport amino acids. This protein may play a role in iron homeostasis. | SLC25A39 | ENSG00000013306 |
| 1465 | cysteine and glycine rich protein 1 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | CSRP1 | ENSG00000159176 |
| 351 | amyloid beta precursor protein | This gene encodes a cell surface receptor and transmembrane precursor protein that is cleaved by secretases to form a number of peptides. Some of these peptides are secreted and can bind to the acetyltransferase complex APBB1/TIP60 to promote transcriptional activation, while others form the protein basis of the amyloid plaques found in the brains of patients with Alzheimer disease. In addition, two of the peptides are antimicrobial peptides, having been shown to have bactericidal and antifungal activities. Mutations in this gene have been implicated in autosomal dominant Alzheimer disease and cerebroarterial amyloidosis (cerebral amyloid angiopathy). Multiple transcript variants encoding several different isoforms have been found for this gene. | APP | ENSG00000142192 |
| ENSG00000225630 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | NA | MTND2P28 | ENSG00000225630 |
| 2037 | erythrocyte membrane protein band 4.1 like 2 | NA | EPB41L2 | ENSG00000079819 |
| 8531 | Y-box binding protein 3 | NA | YBX3 | ENSG00000060138 |
| 1266 | calponin 3 | This gene encodes a protein with a markedly acidic C terminus; the basic N-terminus is highly homologous to the N-terminus of a related gene, CNN1. Members of the CNN gene family all contain similar tandemly repeated motifs. This encoded protein is associated with the cytoskeleton but is not involved in contraction. | CNN3 | ENSG00000117519 |
| 2512 | ferritin, light polypeptide | This gene encodes the light subunit of the ferritin protein. Ferritin is the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in this light chain ferritin gene are associated with several neurodegenerative diseases and hyperferritinemia-cataract syndrome. This gene has multiple pseudogenes. | FTL | ENSG00000087086 |
| 5159 | platelet derived growth factor receptor beta | This gene encodes a cell surface tyrosine kinase receptor for members of the platelet-derived growth factor family. These growth factors are mitogens for cells of mesenchymal origin. The identity of the growth factor bound to a receptor monomer determines whether the functional receptor is a homodimer or a heterodimer, composed of both platelet-derived growth factor receptor alpha and beta polypeptides. This gene is flanked on chromosome 5 by the genes for granulocyte-macrophage colony-stimulating factor and macrophage-colony stimulating factor receptor; all three genes may be implicated in the 5-q syndrome. A translocation between chromosomes 5 and 12, that fuses this gene to that of the translocation, ETV6, leukemia gene, results in chronic myeloproliferative disorder with eosinophilia. | PDGFRB | ENSG00000113721 |
| 3959 | galectin 3 binding protein | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. LGALS3BP has been found elevated in the serum of patients with cancer and in those infected by the human immunodeficiency virus (HIV). It appears to be implicated in immune response associated with natural killer (NK) and lymphokine-activated killer (LAK) cell cytotoxicity. Using fluorescence in situ hybridization the full length 90K cDNA has been localized to chromosome 17q25. The native protein binds specifically to a human macrophage-associated lectin known as Mac-2 and also binds galectin 1. | LGALS3BP | ENSG00000108679 |
| 805 | calmodulin 2 (phosphorylase kinase, delta) | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | CALM2 | ENSG00000143933 |
| 23313 | KIAA0930 | NA | KIAA0930 | ENSG00000100364 |
| 1152 | creatine kinase B | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | CKB | ENSG00000166165 |
| 1293 | collagen type VI alpha 3 chain | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | COL6A3 | ENSG00000163359 |
| 7038 | thyroglobulin | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | TG | ENSG00000042832 |
| 1471 | cystatin C | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and the kininogens. The type 2 cystatin proteins are a class of cysteine proteinase inhibitors found in a variety of human fluids and secretions, where they appear to provide protective functions. The cystatin locus on chromosome 20 contains the majority of the type 2 cystatin genes and pseudogenes. This gene is located in the cystatin locus and encodes the most abundant extracellular inhibitor of cysteine proteases, which is found in high concentrations in biological fluids and is expressed in virtually all organs of the body. A mutation in this gene has been associated with amyloid angiopathy. Expression of this protein in vascular wall smooth muscle cells is severely reduced in both atherosclerotic and aneurysmal aortic lesions, establishing its role in vascular disease. In addition, this protein has been shown to have an antimicrobial function, inhibiting the replication of herpes simplex virus. Alternative splicing results in multiple transcript variants encoding a single protein. | CST3 | ENSG00000101439 |
| 5376 | peripheral myelin protein 22 | This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | PMP22 | ENSG00000109099 |
| 3320 | heat shock protein 90kDa alpha family class A member 1 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | HSP90AA1 | ENSG00000080824 |
| 3106 | major histocompatibility complex, class I, B | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | HLA-B | ENSG00000234745 |
| 219654 | zinc finger CCHC-type containing 24 | NA | ZCCHC24 | ENSG00000165424 |
| 3912 | laminin subunit beta 1 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 1. The beta 1 chain has 7 structurally distinct domains which it shares with other beta chain isomers. The C-terminal helical region containing domains I and II are separated by domain alpha, domains III and V contain several EGF-like repeats, and domains IV and VI have a globular conformation. Laminin, beta 1 is expressed in most tissues that produce basement membranes, and is one of the 3 chains constituting laminin 1, the first laminin isolated from Engelbreth-Holm-Swarm (EHS) tumor. A sequence in the beta 1 chain that is involved in cell attachment, chemotaxis, and binding to the laminin receptor was identified and shown to have the capacity to inhibit metastasis. | LAMB1 | ENSG00000091136 |
| 4735 | septin 2 | NA | SEPT2 | ENSG00000168385 |
| 7267 | tetratricopeptide repeat domain 3 | NA | TTC3 | ENSG00000182670 |
| 151887 | coiled-coil domain containing 80 | NA | CCDC80 | ENSG00000091986 |
| 7311 | ubiquitin A-52 residue ribosomal protein fusion product 1 | Ubiquitin is a highly conserved nuclear and cytoplasmic protein that has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene encodes a fusion protein consisting of ubiquitin at the N terminus and ribosomal protein L40 at the C terminus, a C-terminal extension protein (CEP). Multiple processed pseudogenes derived from this gene are present in the genome. | UBA52 | ENSG00000221983 |
| 8522 | growth arrest specific 7 | Growth arrest-specific 7 is expressed primarily in terminally differentiated brain cells and predominantly in mature cerebellar Purkinje neurons. GAS7 plays a putative role in neuronal development. Several transcript variants encoding proteins which vary in the N-terminus have been described. | GAS7 | ENSG00000007237 |
| 7805 | lysosomal protein transmembrane 5 | This gene encodes a transmembrane receptor that is associated with lysosomes. The encoded protein, also known as E3 protein, may play a role in hematopoiesis. | LAPTM5 | ENSG00000162511 |
| 9839 | zinc finger E-box binding homeobox 2 | The protein encoded by this gene is a member of the Zfh1 family of 2-handed zinc finger/homeodomain proteins. It is located in the nucleus and functions as a DNA-binding transcriptional repressor that interacts with activated SMADs. Mutations in this gene are associated with Hirschsprung disease/Mowat-Wilson syndrome. Alternatively spliced transcript variants have been found for this gene. | ZEB2 | ENSG00000169554 |
| 667 | dystonin | This gene encodes a member of the plakin protein family of adhesion junction plaque proteins. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene, but the full-length nature of some variants has not been defined. It has been reported that some isoforms are expressed in neural and muscle tissue, anchoring neural intermediate filaments to the actin cytoskeleton, and some isoforms are expressed in epithelial tissue, anchoring keratin-containing intermediate filaments to hemidesmosomes. Consistent with the expression, mice defective for this gene show skin blistering and neurodegeneration. | DST | ENSG00000151914 |
| 64423 | inverted formin, FH2 and WH2 domain containing | This gene represents a member of the formin family of proteins. It is considered a diaphanous formin due to the presence of a diaphanous inhibitory domain located at the N-terminus of the encoded protein. Studies of a similar mouse protein indicate that the protein encoded by this locus may function in polymerization and depolymerization of actin filaments. Mutations at this locus have been associated with focal segmental glomerulosclerosis 5. | INF2 | ENSG00000203485 |
| 1307 | collagen type XVI alpha 1 chain | This gene encodes the alpha chain of type XVI collagen, a member of the FACIT collagen family (fibril-associated collagens with interrupted helices). Members of this collagen family are found in association with fibril-forming collagens such as type I and II, and serve to maintain the integrity of the extracellular matrix. High levels of type XVI collagen have been found in fibroblasts and keratinocytes, and in smooth muscle and amnion. | COL16A1 | ENSG00000084636 |
| 1191 | clusterin | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | CLU | ENSG00000120885 |
| 3572 | interleukin 6 signal transducer | The protein encoded by this gene is a signal transducer shared by many cytokines, including interleukin 6 (IL6), ciliary neurotrophic factor (CNTF), leukemia inhibitory factor (LIF), and oncostatin M (OSM). This protein functions as a part of the cytokine receptor complex. The activation of this protein is dependent upon the binding of cytokines to their receptors. vIL6, a protein related to IL6 and encoded by the Kaposi sarcoma-associated herpesvirus, can bypass the interleukin 6 receptor (IL6R) and directly activate this protein. Knockout studies in mice suggest that this gene plays a critical role in regulating myocyte apoptosis. Alternatively spliced transcript variants have been described. A related pseudogene has been identified on chromosome 17. | IL6ST | ENSG00000134352 |
| 10313 | reticulon 3 | This gene belongs to the reticulon family of highly conserved genes that are preferentially expressed in neuroendocrine tissues. This family of proteins interact with, and modulate the activity of beta-amyloid converting enzyme 1 (BACE1), and the production of amyloid-beta. An increase in the expression of any reticulon protein substantially reduces the production of amyloid-beta, suggesting that reticulon proteins are negative modulators of BACE1 in cells. Alternatively spliced transcript variants encoding different isoforms have been found for this gene, and pseudogenes of this gene are located on chromosomes 4 and 12. | RTN3 | ENSG00000133318 |
| 821 | calnexin | This gene encodes a member of the calnexin family of molecular chaperones. The encoded protein is a calcium-binding, endoplasmic reticulum (ER)-associated protein that interacts transiently with newly synthesized N-linked glycoproteins, facilitating protein folding and assembly. It may also play a central role in the quality control of protein folding by retaining incorrectly folded protein subunits within the ER for degradation. Alternatively spliced transcript variants encoding the same protein have been described. | CANX | ENSG00000127022 |
| 216 | aldehyde dehydrogenase 1 family member A1 | The protein encoded by this gene belongs to the aldehyde dehydrogenase family. Aldehyde dehydrogenase is the next enzyme after alcohol dehydrogenase in the major pathway of alcohol metabolism. There are two major aldehyde dehydrogenase isozymes in the liver, cytosolic and mitochondrial, which are encoded by distinct genes, and can be distinguished by their electrophoretic mobility, kinetic properties, and subcellular localization. This gene encodes the cytosolic isozyme. Studies in mice show that through its role in retinol metabolism, this gene may also be involved in the regulation of the metabolic responses to high-fat diet. | ALDH1A1 | ENSG00000165092 |
| 4060 | lumican | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin. In these bifunctional molecules, the protein moiety binds collagen fibrils and the highly charged hydrophilic glycosaminoglycans regulate interfibrillar spacings. Lumican is the major keratan sulfate proteoglycan of the cornea but is also distributed in interstitial collagenous matrices throughout the body. Lumican may regulate collagen fibril organization and circumferential growth, corneal transparency, and epithelial cell migration and tissue repair. | LUM | ENSG00000139329 |
| 53826 | FXYD domain containing ion transport regulator 6 | This gene encodes a member of the FXYD family of transmembrane proteins. This particular protein encodes phosphohippolin, which likely affects the activity of Na,K-ATPase. Multiple alternatively spliced transcript variants encoding the same protein have been described. Related pseudogenes have been identified on chromosomes 10 and X. Read-through transcripts have been observed between this locus and the downstream sodium/potassium-transporting ATPase subunit gamma (FXYD2, GeneID 486) locus. | FXYD6 | ENSG00000137726 |
| 3860 | keratin 13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | KRT13 | ENSG00000171401 |
| 11031 | RAB31, member RAS oncogene family | Small GTP-binding proteins of the RAB family, such as RAB31, play essential roles in vesicle and granule targeting (Bao et al., 2002 [PubMed 11784320]). | RAB31 | ENSG00000168461 |
| 54541 | DNA damage inducible transcript 4 | NA | DDIT4 | ENSG00000168209 |
| 57447 | NDRG family member 2 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | NDRG2 | ENSG00000165795 |
| 54587 | matrix remodeling associated 8 | NA | MXRA8 | ENSG00000162576 |
| 10160 | FERM, ARH/RhoGEF and pleckstrin domain protein 1 | This gene encodes a protein containing a FERM (4.1, ezrin, radixin, moesin) domain, a Dbl homology domain, and two pleckstrin homology domains. These domains are found in guanine nucleotide exchange factors and proteins that link the cytoskeleton to the cell membrane. The encoded protein functions in neurons to promote dendritic growth. Alternative splicing results in multiple transcript variants. | FARP1 | ENSG00000152767 |
| 7184 | heat shock protein 90kDa beta family member 1 | This gene encodes a member of a family of adenosine triphosphate(ATP)-metabolizing molecular chaperones with roles in stabilizing and folding other proteins. The encoded protein is localized to melanosomes and the endoplasmic reticulum. Expression of this protein is associated with a variety of pathogenic states, including tumor formation. There is a microRNA gene located within the 5’ exon of this gene. There are pseudogenes for this gene on chromosomes 1 and 15. | HSP90B1 | ENSG00000166598 |
| 57608 | KIAA1462 | NA | KIAA1462 | ENSG00000165757 |
| 1289 | collagen type V alpha 1 | This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. The encoded procollagen protein occurs commonly as the heterotrimer pro-alpha1(V)-pro-alpha1(V)-pro-alpha2(V). Mutations in this gene are associated with Ehlers-Danlos syndrome, types I and II. Alternative splicing of this gene results in multiple transcript variants. | COL5A1 | ENSG00000130635 |
| 9782 | matrin 3 | This gene encodes a nuclear matrix protein, which is proposed to stabilize certain messenger RNA species. Mutations of this gene are associated with distal myopathy 2, which often includes vocal cord and pharyngeal weakness. Alternatively spliced transcript variants, including read-through transcripts composed of the upstream small nucleolar RNA host gene 4 (non-protein coding) and matrin 3 gene sequence, have been identified. Pseudogenes of this gene are located on chromosomes 1 and X. | MATR3 | ENSG00000015479 |
| 9590 | A-kinase anchoring protein 12 | The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. The encoded protein is expressed in endothelial cells, cultured fibroblasts, and osteosarcoma cells. It associates with protein kinases A and C and phosphatase, and serves as a scaffold protein in signal transduction. This protein and RII PKA colocalize at the cell periphery. This protein is a cell growth-related protein. Antibodies to this protein can be produced by patients with myasthenia gravis. Alternative splicing of this gene results in two transcript variants encoding different isoforms. | AKAP12 | ENSG00000131016 |
| 79026 | AHNAK nucleoprotein | NA | AHNAK | ENSG00000124942 |
| 1281 | collagen type III alpha 1 chain | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | COL3A1 | ENSG00000168542 |
| 9902 | mannose receptor C type 2 | This gene encodes a member of the mannose receptor family of proteins that contain a fibronectin type II domain and multiple C-type lectin-like domains. The encoded protein plays a role in extracellular matrix remodeling by mediating the internalization and lysosomal degradation of collagen ligands. Expression of this gene may play a role in the tumorigenesis and metastasis of several malignancies including breast cancer, gliomas and metastatic bone disease. | MRC2 | ENSG00000011028 |
| 3672 | integrin subunit alpha 1 | This gene encodes the alpha 1 subunit of integrin receptors. This protein heterodimerizes with the beta 1 subunit to form a cell-surface receptor for collagen and laminin. The heterodimeric receptor is involved in cell-cell adhesion and may play a role in inflammation and fibrosis. The alpha 1 subunit contains an inserted (I) von Willebrand factor type I domain which is thought to be involved in collagen binding. | ITGA1 | ENSG00000213949 |
| 51312 | solute carrier family 25 member 37 | SLC25A37 is a solute carrier localized in the mitochondrial inner membrane. It functions as an essential iron importer for the synthesis of mitochondrial heme and iron-sulfur clusters (summary by Chen et al., 2009 [PubMed 19805291]). | SLC25A37 | ENSG00000147454 |
| 1284 | collagen type IV alpha 2 | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. The C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. | COL4A2 | ENSG00000134871 |
| 55544 | RNA binding motif protein 38 | NA | RBM38 | ENSG00000132819 |
| 643314 | KIAA0754 | NA | KIAA0754 | ENSG00000127603 |
| 23499 | microtubule-actin crosslinking factor 1 | This gene encodes a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. The encoded protein is a member of a family of proteins that form bridges between different cytoskeletal elements. This protein facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. Alternative splicing results in multiple transcript variants, but the full-length nature of some of these variants has not been determined. | MACF1 | ENSG00000127603 |
| 7009 | transmembrane BAX inhibitor motif containing 6 | NA | TMBIM6 | ENSG00000139644 |
| 1634 | decorin | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | DCN | ENSG00000011465 |
| 301 | annexin A1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | ANXA1 | ENSG00000135046 |
| 2202 | EGF containing fibulin like extracellular matrix protein 1 | This gene encodes a member of the fibulin family of extracellular matrix glycoproteins. Like all members of this family, the encoded protein contains tandemly repeated epidermal growth factor-like repeats followed by a C-terminus fibulin-type domain. This gene is upregulated in malignant gliomas and may play a role in the aggressive nature of these tumors. Mutations in this gene are associated with Doyne honeycomb retinal dystrophy. Alternatively spliced transcript variants that encode the same protein have been described. | EFEMP1 | ENSG00000115380 |
| 1282 | collagen type IV alpha 1 chain | This gene encodes a type IV collagen alpha protein. Type IV collagen proteins are integral components of basement membranes. This gene shares a bidirectional promoter with a paralogous gene on the opposite strand. The protein consists of an amino-terminal 7S domain, a triple-helix forming collagenous domain, and a carboxy-terminal non-collagenous domain. It functions as part of a heterotrimer and interacts with other extracellular matrix components such as perlecans, proteoglycans, and laminins. In addition, proteolytic cleavage of the non-collagenous carboxy-terminal domain results in a biologically active fragment known as arresten, which has anti-angiogenic and tumor suppressor properties. Mutations in this gene cause porencephaly, cerebrovascular disease, and renal and muscular defects. Alternative splicing results in multiple transcript variants. | COL4A1 | ENSG00000187498 |
| 58 | actin, alpha 1, skeletal muscle | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | ACTA1 | ENSG00000143632 |
| 5730 | prostaglandin D2 synthase | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | PTGDS | ENSG00000107317 |
| 5064 | paralemmin | This gene encodes a member of the paralemmin protein family. The product of this gene is a prenylated and palmitoylated phosphoprotein that associates with the cytoplasmic face of plasma membranes and is implicated in plasma membrane dynamics in neurons and other cell types. Several alternatively spliced transcript variants have been identified, but the full-length nature of only two transcript variants has been determined. | PALM | ENSG00000099864 |
| 9639 | Rho guanine nucleotide exchange factor 10 | This gene encodes a Rho guanine nucleotide exchange factor (GEF). Rho GEFs regulate the activity of small Rho GTPases by stimulating the exchange of guanine diphosphate (GDP) for guanine triphosphate (GTP) and may play a role in neural morphogenesis. Mutations in this gene are associated with slowed nerve conduction velocity (SNCV). Alternative splicing results in multiple transcript variants. | ARHGEF10 | ENSG00000104728 |
| 1809 | dihydropyrimidinase like 3 | NA | DPYSL3 | ENSG00000113657 |
| 813 | calumenin | The product of this gene is a calcium-binding protein localized in the endoplasmic reticulum (ER) and it is involved in such ER functions as protein folding and sorting. This protein belongs to a family of multiple EF-hand proteins (CERC) that include reticulocalbin, ERC-55, and Cab45 and the product of this gene. Alternatively spliced transcript variants encoding different isoforms have been identified. | CALU | ENSG00000128595 |
| 5327 | plasminogen activator, tissue type | This gene encodes tissue-type plasminogen activator, a secreted serine protease that converts the proenzyme plasminogen to plasmin, a fibrinolytic enzyme. The encoded preproprotein is proteolytically processed by plasmin or trypsin to generate heavy and light chains. These chains associate via disulfide linkages to form the heterodimeric enzyme. This enzyme plays a role in cell migration and tissue remodeling. Increased enzymatic activity causes hyperfibrinolysis, which manifests as excessive bleeding, while decreased activity leads to hypofibrinolysis, which can result in thrombosis or embolism. Alternative splicing of this gene results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | PLAT | ENSG00000104368 |
| 7026 | nuclear receptor subfamily 2 group F member 2 | This gene encodes a member of the steroid thyroid hormone superfamily of nuclear receptors. The encoded protein is a ligand inducible transcription factor that is involved in the regulation of many different genes. Alternate splicing results in multiple transcript variants. | NR2F2 | ENSG00000185551 |
| 9770 | Ras association domain family member 2 | This gene encodes a protein that contains a Ras association domain. Similar to its cattle and sheep counterparts, this gene is located near the prion gene. Two alternatively spliced transcripts encoding the same isoform have been reported. | RASSF2 | ENSG00000101265 |
| 8613 | phospholipid phosphatase 3 | The protein encoded by this gene is a member of the phosphatidic acid phosphatase (PAP) family. PAPs convert phosphatidic acid to diacylglycerol, and function in de novo synthesis of glycerolipids as well as in receptor-activated signal transduction mediated by phospholipase D. This protein is a membrane glycoprotein localized at the cell plasma membrane. It has been shown to actively hydrolyze extracellular lysophosphatidic acid and short-chain phosphatidic acid. The expression of this gene is found to be enhanced by epidermal growth factor in Hela cells. | PLPP3 | ENSG00000162407 |
| ENSG00000251322 | SH3 and multiple ankyrin repeat domains 3 | NA | SHANK3 | ENSG00000251322 |
| 9741 | lysosomal protein transmembrane 4 alpha | This gene encodes a protein that has four predicted transmembrane domains. The function of this gene has not yet been determined; however, studies in the mouse homolog suggest a role in the transport of small molecules across endosomal and lysosomal membranes. | LAPTM4A | ENSG00000068697 |
| 7077 | TIMP metallopeptidase inhibitor 2 | This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | TIMP2 | ENSG00000035862 |
| 1291 | collagen type VI alpha 1 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | COL6A1 | ENSG00000142156 |
| 60681 | FK506 binding protein 10 | The protein encoded by this gene belongs to the FKBP-type peptidyl-prolyl cis/trans isomerase (PPIase) family. This protein localizes to the endoplasmic reticulum and acts as a molecular chaperone. Alternatively spliced variants encoding different isoforms have been reported, but their biological validity has not been determined. | FKBP10 | ENSG00000141756 |
| 6507 | solute carrier family 1 member 3 | This gene encodes a member of a high affinity glutamate transporter family. This gene functions in the termination of excitatory neurotransmission in the central nervous system. Mutations are associated with episodic ataxia, Type 6. Alternative splicing results in multiple transcript variants. | SLC1A3 | ENSG00000079215 |
| 7531 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein epsilon | This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 100% identical to the mouse ortholog. It interacts with CDC25 phosphatases, RAF1 and IRS1 proteins, suggesting its role in diverse biochemical activities related to signal transduction, such as cell division and regulation of insulin sensitivity. It has also been implicated in the pathogenesis of small cell lung cancer. Two transcript variants, one protein-coding and the other non-protein-coding, have been found for this gene. | YWHAE | ENSG00000108953 |
| 4703 | nebulin | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | NEB | ENSG00000183091 |
| 6515 | solute carrier family 2 member 3 | NA | SLC2A3 | ENSG00000059804 |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 6, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
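The same query-then-write step is repeated for every cluster, so it could be wrapped in a small helper. The sketch below is only an illustration and not part of the original analysis: the function name `write_cluster_genes`, its arguments, the de-duplication of IDs, and the optional dropping of rows flagged in a `notfound` column are all assumptions. It runs on a mock data frame, so no call to `mygene` is needed, and the ID `ENSG00000000000` is a made-up placeholder for an unmatched query.

```r
# Hypothetical helper (an assumption, not the original code): write the query
# IDs of one cluster to "gene_names_clus_<cluster>.txt", one ID per line.
write_cluster_genes <- function(out, cluster, out_dir, drop_notfound = FALSE) {
  res <- as.data.frame(out)
  if (drop_notfound && "notfound" %in% names(res)) {
    # Keep rows where notfound is NA (matched) and drop rows where it is TRUE.
    res <- res[is.na(res$notfound) | !res$notfound, , drop = FALSE]
  }
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  # unique() removes repeated IDs; the original write.table call keeps them.
  write.table(unique(res$query), path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Mock query result: one matched ID listed twice and one placeholder ID
# flagged as not found.
mock <- data.frame(
  query    = c("ENSG00000244734", "ENSG00000000000", "ENSG00000244734"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
path <- write_cluster_genes(mock, cluster = 7, out_dir = tempdir(),
                            drop_notfound = TRUE)
written <- readLines(path)
```

With `drop_notfound = FALSE` the helper keeps every queried ID, which is closer to the `write.table` call above apart from the de-duplication.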
out <- mygene::queryMany(gene_list[7,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | symbol | X_id | summary | name | notfound |
|---|---|---|---|---|---|
| ENSG00000244734 | HBB | 3043 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | hemoglobin subunit beta | NA |
| ENSG00000197616 | MYH6 | 4624 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | myosin, heavy chain 6, cardiac muscle, alpha | NA |
| ENSG00000165795 | NDRG2 | 57447 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | NDRG family member 2 | NA |
| ENSG00000135821 | GLUL | 2752 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | glutamate-ammonia ligase | NA |
| ENSG00000104879 | CKM | 1158 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | creatine kinase, M-type | NA |
| ENSG00000175206 | NPPA | 4878 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | natriuretic peptide A | NA |
| ENSG00000189058 | APOD | 347 | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | apolipoprotein D | NA |
| ENSG00000077522 | ACTN2 | 88 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | actinin alpha 2 | NA |
| ENSG00000168309 | FAM107A | 11170 | NA | family with sequence similarity 107 member A | NA |
| ENSG00000131095 | GFAP | 2670 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | glial fibrillary acidic protein | NA |
| ENSG00000099194 | SCD | 6319 | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | stearoyl-CoA desaturase | NA |
| ENSG00000021300 | PLEKHB1 | 58473 | NA | pleckstrin homology domain containing B1 | NA |
| ENSG00000130294 | KIF1A | 547 | The protein encoded by this gene is a member of the kinesin family and functions as an anterograde motor protein that transports membranous organelles along axonal microtubules. Mutations at this locus have been associated with spastic paraplegia-30 and hereditary sensory neuropathy IIC. Alternatively spliced transcript variants encoding distinct isoforms have been described. | kinesin family member 1A | NA |
| ENSG00000167588 | GPD1 | 2819 | This gene encodes a member of the NAD-dependent glycerol-3-phosphate dehydrogenase family. The encoded protein plays a critical role in carbohydrate and lipid metabolism by catalyzing the reversible conversion of dihydroxyacetone phosphate (DHAP) and reduced nicotine adenine dinucleotide (NADH) to glycerol-3-phosphate (G3P) and NAD+. The encoded cytosolic protein and mitochondrial glycerol-3-phosphate dehydrogenase also form a glycerol phosphate shuttle that facilitates the transfer of reducing equivalents from the cytosol to mitochondria. Mutations in this gene are a cause of transient infantile hypertriglyceridemia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | glycerol-3-phosphate dehydrogenase 1 | NA |
| ENSG00000137801 | THBS1 | 7057 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | thrombospondin 1 | NA |
| ENSG00000175445 | LPL | 4023 | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | lipoprotein lipase | NA |
| ENSG00000175084 | DES | 1674 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | desmin | NA |
| ENSG00000211445 | GPX3 | 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | glutathione peroxidase 3 | NA |
| ENSG00000198467 | TPM2 | 7169 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | tropomyosin 2 (beta) | NA |
| ENSG00000206172 | HBA1 | 3039 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | hemoglobin subunit alpha 1 | NA |
| ENSG00000106631 | MYL7 | 58498 | NA | myosin light chain 7 | NA |
| ENSG00000149925 | ALDOA | 226 | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | aldolase, fructose-bisphosphate A | NA |
| ENSG00000130203 | APOE | 348 | The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | apolipoprotein E | NA |
| ENSG00000112531 | QKI | 9444 | The protein encoded by this gene is an RNA-binding protein that regulates pre-mRNA splicing, export of mRNAs from the nucleus, protein translation, and mRNA stability. The encoded protein is involved in myelinization and oligodendrocyte differentiation and may play a role in schizophrenia. Multiple transcript variants encoding different isoforms have been found for this gene. | QKI, KH domain containing, RNA binding | NA |
| ENSG00000143632 | ACTA1 | 58 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | actin, alpha 1, skeletal muscle | NA |
| ENSG00000197971 | MBP | 4155 | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | myelin basic protein | NA |
| ENSG00000076555 | ACACB | 32 | Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | acetyl-CoA carboxylase beta | NA |
| ENSG00000136717 | BIN1 | 274 | This gene encodes several isoforms of a nucleocytoplasmic adaptor protein, one of which was initially identified as a MYC-interacting protein with features of a tumor suppressor. Isoforms that are expressed in the central nervous system may be involved in synaptic vesicle endocytosis and may interact with dynamin, synaptojanin, endophilin, and clathrin. Isoforms that are expressed in muscle and ubiquitously expressed isoforms localize to the cytoplasm and nucleus and activate a caspase-independent apoptotic process. Studies in mouse suggest that this gene plays an important role in cardiac muscle development. Alternate splicing of the gene results in several transcript variants encoding different isoforms. Aberrant splice variants expressed in tumor cell lines have also been described. | bridging integrator 1 | NA |
| ENSG00000118194 | TNNT2 | 7139 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | troponin T2, cardiac type | NA |
| ENSG00000140416 | TPM1 | 7168 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | tropomyosin 1 (alpha) | NA |
| ENSG00000121769 | FABP3 | 2170 | The intracellular fatty acid-binding proteins (FABPs) belong to a multigene family. FABPs are divided into at least three distinct types, namely the hepatic-, intestinal- and cardiac-type. They form 14-15 kDa proteins and are thought to participate in the uptake, intracellular metabolism and/or transport of long-chain fatty acids. They may also be responsible for the modulation of cell growth and proliferation. Fatty acid-binding protein 3 gene contains four exons and its function is to arrest growth of mammary epithelial cells. This gene is a candidate tumor suppressor gene for human breast cancer. Alternative splicing results in multiple transcript variants. | fatty acid binding protein 3 | NA |
| ENSG00000166165 | CKB | 1152 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | creatine kinase B | NA |
| ENSG00000159251 | ACTC1 | 70 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | actin, alpha, cardiac muscle 1 | NA |
| ENSG00000170477 | KRT4 | 3851 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 4 | NA |
| ENSG00000111245 | MYL2 | 4633 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | myosin light chain 2 | NA |
| ENSG00000075624 | ACTB | 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | actin, beta | NA |
| ENSG00000107331 | ABCA2 | 20 | The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. This protein is highly expressed in brain tissue and may play a role in macrophage lipid metabolism and neural development. Two transcript variants encoding different isoforms have been found for this gene. | ATP binding cassette subfamily A member 2 | NA |
| ENSG00000014641 | MDH1 | 4190 | This gene encodes an enzyme that catalyzes the NAD/NADH-dependent, reversible oxidation of malate to oxaloacetate in many metabolic pathways, including the citric acid cycle. Two main isozymes are known to exist in eukaryotic cells: one is found in the mitochondrial matrix and the other in the cytoplasm. This gene encodes the cytosolic isozyme, which plays a key role in the malate-aspartate shuttle that allows malate to pass through the mitochondrial membrane to be transformed into oxaloacetate for further cellular processes. Alternatively spliced transcript variants have been found for this gene. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is localized in the peroxisomes. Pseudogenes have been identified on chromosomes X and 6. | malate dehydrogenase 1 | NA |
| ENSG00000117115 | PADI2 | 11240 | This gene encodes a member of the peptidyl arginine deiminase family of enzymes, which catalyze the post-translational deimination of proteins by converting arginine residues into citrullines in the presence of calcium ions. The family members have distinct substrate specificities and tissue-specific expression patterns. The type II enzyme is the most widely expressed family member. Known substrates for this enzyme include myelin basic protein in the central nervous system and vimentin in skeletal muscle and macrophages. This enzyme is thought to play a role in the onset and progression of neurodegenerative human disorders, including Alzheimer disease and multiple sclerosis, and it has also been implicated in glaucoma pathogenesis. This gene exists in a cluster with four other paralogous genes. | peptidyl arginine deiminase 2 | NA |
| ENSG00000242349 | NPPA-AS1 | ENSG00000242349 | NA | NPPA antisense RNA 1 | NA |
| ENSG00000167468 | GPX4 | 2879 | This gene encodes a member of the glutathione peroxidase protein family. Glutathione peroxidase catalyzes the reduction of hydrogen peroxide, organic hydroperoxide, and lipid peroxides by reduced glutathione and functions in the protection of cells against oxidative damage. Human plasma glutathione peroxidase has been shown to be a selenium-containing enzyme and the UGA codon is translated into a selenocysteine. The encoded protein has been identified as a moonlighting protein based on its ability to serve dual functions as a peroxidase as well as a structural protein in mature spermatozoa. Through alternative splicing and transcription initiation, rat produces proteins that localize to the nucleus, mitochondrion, and cytoplasm. In humans, alternative transcription initiation and the cleavage sites of the mitochondrial and nuclear transit peptides need to be experimentally verified. Alternative splicing results in multiple transcript variants. | glutathione peroxidase 4 | NA |
| ENSG00000167460 | TPM4 | 7171 | This gene encodes a member of the tropomyosin family of actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosins are dimers of coiled-coil proteins that polymerize end-to-end along the major groove in most actin filaments. They provide stability to the filaments and regulate access of other actin-binding proteins. In muscle cells, they regulate muscle contraction by controlling the binding of myosin heads to the actin filament. Multiple transcript variants encoding different isoforms have been found for this gene. | tropomyosin 4 | NA |
| ENSG00000148677 | ANKRD1 | 27063 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | ankyrin repeat domain 1 | NA |
| ENSG00000145284 | SCD5 | 79966 | Stearoyl-CoA desaturase (SCD; EC 1.14.99.5) is an integral membrane protein of the endoplasmic reticulum that catalyzes the formation of monounsaturated fatty acids from saturated fatty acids. SCD may be a key regulator of energy metabolism with a role in obesity and dyslipidemia. Four SCD isoforms, Scd1 through Scd4, have been identified in mouse. In contrast, only 2 SCD isoforms, SCD1 (MIM 604031) and SCD5, have been identified in human. SCD1 shares about 85% amino acid identity with all 4 mouse SCD isoforms, as well as with rat Scd1 and Scd2. In contrast, SCD5 shares limited homology with the rodent SCDs and appears to be unique to primates (Wang et al., 2005 [PubMed 15907797]). | stearoyl-CoA desaturase 5 | NA |
| ENSG00000122304 | PRM2 | 5620 | Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | protamine 2 | NA |
| ENSG00000169710 | FASN | 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | fatty acid synthase | NA |
| ENSG00000152137 | HSPB8 | 26353 | The protein encoded by this gene belongs to the superfamily of small heat-shock proteins containing a conservative alpha-crystallin domain at the C-terminal part of the molecule. The expression of this gene is induced by estrogen in estrogen receptor-positive breast cancer cells, and this protein also functions as a chaperone in association with Bag3, a stimulator of macroautophagy. Thus, this gene appears to be involved in regulation of cell proliferation, apoptosis, and carcinogenesis, and mutations in this gene have been associated with different neuromuscular diseases, including Charcot-Marie-Tooth disease. | heat shock protein family B (small) member 8 | NA |
| ENSG00000155657 | TTN | 7273 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | titin | NA |
| ENSG00000145362 | ANK2 | 287 | This gene encodes a member of the ankyrin family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton. Ankyrins play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. The protein encoded by this gene is required for targeting and stability of Na/Ca exchanger 1 in cardiomyocytes. Mutations in this gene cause long QT syndrome 4 and cardiac arrhythmia syndrome. Multiple transcript variants encoding different isoforms have been described. | ankyrin 2, neuronal | NA |
| ENSG00000188536 | HBA2 | 3040 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | hemoglobin subunit alpha 2 | NA |
| ENSG00000256545 | NA | NA | NA | NA | TRUE |
| ENSG00000184009 | ACTG1 | 71 | Actins are highly conserved proteins that are involved in various types of cell motility, and maintenance of the cytoskeleton. In vertebrates, three main groups of actin isoforms, alpha, beta and gamma have been identified. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton, and as mediators of internal cell motility. Actin, gamma 1, encoded by this gene, is a cytoplasmic actin found in non-muscle cells. Mutations in this gene are associated with DFNA20/26, a subtype of autosomal dominant non-syndromic sensorineural progressive hearing loss. Alternative splicing results in multiple transcript variants. | actin gamma 1 | NA |
| ENSG00000087250 | MT3 | 4504 | NA | metallothionein 3 | NA |
| ENSG00000121653 | MAPK8IP1 | 9479 | This gene encodes a regulator of the pancreatic beta-cell function. It is highly similar to JIP-1, a mouse protein known to be a regulator of c-Jun amino-terminal kinase (Mapk8). This protein has been shown to prevent MAPK8 mediated activation of transcription factors, and to decrease IL-1 beta and MAP kinase kinase 1 (MEKK1) induced apoptosis in pancreatic beta cells. This protein also functions as a DNA-binding transactivator of the glucose transporter GLUT2. RE1-silencing transcription factor (REST) is reported to repress the expression of this gene in insulin-secreting beta cells. This gene is found to be mutated in a type 2 diabetes family, and thus is thought to be a susceptibility gene for type 2 diabetes. | mitogen-activated protein kinase 8 interacting protein 1 | NA |
| ENSG00000134571 | MYBPC3 | 4607 | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | myosin binding protein C, cardiac | NA |
| ENSG00000106624 | AEBP1 | 165 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | AE binding protein 1 | NA |
| ENSG00000171401 | KRT13 | 3860 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | keratin 13 | NA |
| ENSG00000163209 | SPRR3 | 6707 | NA | small proline rich protein 3 | NA |
| ENSG00000189043 | NDUFA4 | 4697 | The protein encoded by this gene belongs to the complex I 9kDa subunit family. Mammalian complex I of mitochondrial respiratory chain is composed of 45 different subunits. This protein has NADH dehydrogenase activity and oxidoreductase activity. It transfers electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone. | NDUFA4, mitochondrial complex associated | NA |
| ENSG00000018625 | ATP1A2 | 477 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 2 subunit. Mutations in this gene result in familial basilar or hemiplegic migraines, and in a rare syndrome known as alternating hemiplegia of childhood. | ATPase Na+/K+ transporting subunit alpha 2 | NA |
| ENSG00000092054 | MYH7 | 4625 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | myosin, heavy chain 7, cardiac muscle, beta | NA |
| ENSG00000106366 | SERPINE1 | 5054 | This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | serpin family E member 1 | NA |
| ENSG00000196616 | ADH1B | 125 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | alcohol dehydrogenase 1B (class I), beta polypeptide | NA |
| ENSG00000168280 | KIF5C | 3800 | The protein encoded by this gene is a kinesin heavy chain subunit involved in the transport of cargo within the central nervous system. The encoded protein, which acts as a tetramer by associating with another heavy chain and two light chains, interacts with protein kinase CK2. Mutations in this gene have been associated with complex cortical dysplasia with other brain malformations-2. Two transcript variants, one protein-coding and the other non-protein coding, have been found for this gene. | kinesin family member 5C | NA |
| ENSG00000106772 | PRUNE2 | 158471 | The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. | prune homolog 2 | NA |
| ENSG00000161281 | COX7A1 | 1346 | Cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. This component is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may function in the regulation and assembly of the complex. This nuclear gene encodes polypeptide 1 (muscle isoform) of subunit VIIa and the polypeptide 1 is present only in muscle tissues. Other polypeptides of subunit VIIa are present in both muscle and nonmuscle tissues, and are encoded by different genes. | cytochrome c oxidase subunit 7A1 | NA |
| ENSG00000171992 | SYNPO | 11346 | Synaptopodin is an actin-associated protein that may play a role in actin-based cell shape and motility. The name synaptopodin derives from the protein’s associations with postsynaptic densities and dendritic spines and with renal podocytes (Mundel et al., 1997 [PubMed 9314539]). | synaptopodin | NA |
| ENSG00000131771 | PPP1R1B | 84152 | This gene encodes a bifunctional signal transduction molecule. Dopaminergic and glutamatergic receptor stimulation regulates its phosphorylation and function as a kinase or phosphatase inhibitor. As a target for dopamine, this gene may serve as a therapeutic target for neurologic and psychiatric disorders. Multiple transcript variants encoding different isoforms have been found for this gene. | protein phosphatase 1 regulatory inhibitor subunit 1B | NA |
| ENSG00000115306 | SPTBN1 | 6711 | Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein contains an N-terminal actin-binding domain, and 17 spectrin repeats which are involved in dimer formation. Multiple transcript variants encoding different isoforms have been found for this gene. | spectrin beta, non-erythrocytic 1 | NA |
| ENSG00000196091 | MYBPC1 | 4604 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | myosin binding protein C, slow type | NA |
| ENSG00000160209 | LOC105372824 | 105372824 | NA | uncharacterized LOC105372824 | NA |
| ENSG00000160209 | PDXK | 8566 | The protein encoded by this gene phosphorylates vitamin B6, a step required for the conversion of vitamin B6 to pyridoxal-5-phosphate, an important cofactor in intermediary metabolism. The encoded protein is cytoplasmic and probably acts as a homodimer. Alternatively spliced transcript variants have been described, but their biological validity has not been determined. | pyridoxal (pyridoxine, vitamin B6) kinase | NA |
| ENSG00000157827 | FMNL2 | 114793 | This gene encodes a formin-related protein. Formin-related proteins have been implicated in morphogenesis, cytokinesis, and cell polarity. Alternatively spliced transcript variants encoding different isoforms have been described but their full-length nature has yet to be determined. | formin like 2 | NA |
| ENSG00000089220 | PEBP1 | 5037 | This gene encodes a member of the phosphatidylethanolamine-binding family of proteins and has been shown to modulate multiple signaling pathways, including the MAP kinase (MAPK), NF-kappa B, and glycogen synthase kinase-3 (GSK-3) signaling pathways. The encoded protein can be further processed to form a smaller cleavage product, hippocampal cholinergic neurostimulating peptide (HCNP), which may be involved in neural development. This gene has been implicated in numerous human cancers and may act as a metastasis suppressor gene. Multiple pseudogenes of this gene have been identified in the genome. | phosphatidylethanolamine binding protein 1 | NA |
| ENSG00000179364 | PACS2 | 23241 | NA | phosphofurin acidic cluster sorting protein 2 | NA |
| ENSG00000078114 | NEBL | 10529 | This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. | nebulette | NA |
| ENSG00000266844 | RP11-862L9.3 | ENSG00000266844 | NA | NA | NA |
| ENSG00000119927 | GPAM | 57678 | This gene encodes a mitochondrial enzyme which prefers saturated fatty acids as its substrate for the synthesis of glycerolipids. This metabolic pathway’s first step is catalyzed by the encoded enzyme. Two forms for this enzyme exist, one in the mitochondria and one in the endoplasmic reticulum. Two alternatively spliced transcript variants have been described for this gene. | glycerol-3-phosphate acyltransferase, mitochondrial | NA |
| ENSG00000130176 | CNN1 | 1264 | NA | calponin 1 | NA |
| ENSG00000006282 | SPATA20 | 64847 | NA | spermatogenesis associated 20 | NA |
| ENSG00000107796 | ACTA2 | 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in smooth muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | actin, alpha 2, smooth muscle, aorta | NA |
| ENSG00000198523 | PLN | 5350 | The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | phospholamban | NA |
| ENSG00000197893 | NRAP | 4892 | NA | nebulin related anchoring protein | NA |
| ENSG00000064607 | SUGP2 | 10147 | This gene encodes a member of the arginine/serine-rich family of splicing factors. The encoded protein functions in mRNA processing. Alternatively spliced transcript variants have been described. | SURP and G-patch domain containing 2 | NA |
| ENSG00000237973 | MTCO1P12 | ENSG00000237973 | NA | MT-CO1 pseudogene 12 | NA |
| ENSG00000111640 | GAPDH | 2597 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | glyceraldehyde-3-phosphate dehydrogenase | NA |
| ENSG00000151729 | SLC25A4 | 291 | This gene is a member of the mitochondrial carrier subfamily of solute carrier protein genes. The product of this gene functions as a gated pore that translocates ADP from the cytoplasm into the mitochondrial matrix and ATP from the mitochondrial matrix into the cytoplasm. The protein forms a homodimer embedded in the inner mitochondrial membrane. Mutations in this gene have been shown to result in autosomal dominant progressive external ophthalmoplegia and familial hypertrophic cardiomyopathy. | solute carrier family 25 member 4 | NA |
| ENSG00000245532 | NEAT1 | 283131 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | nuclear paraspeckle assembly transcript 1 (non-protein coding) | NA |
| ENSG00000175646 | PRM1 | 5619 | NA | protamine 1 | NA |
| ENSG00000007237 | GAS7 | 8522 | Growth arrest-specific 7 is expressed primarily in terminally differentiated brain cells and predominantly in mature cerebellar Purkinje neurons. GAS7 plays a putative role in neuronal development. Several transcript variants encoding proteins which vary in the N-terminus have been described. | growth arrest specific 7 | NA |
| ENSG00000074800 | ENO1 | 2023 | This gene encodes alpha-enolase, one of three enolase isoenzymes found in mammals. Each isoenzyme is a homodimer composed of 2 alpha, 2 gamma, or 2 beta subunits, and functions as a glycolytic enzyme. Alpha-enolase in addition, functions as a structural lens protein (tau-crystallin) in the monomeric form. Alternative splicing of this gene results in a shorter isoform that has been shown to bind to the c-myc promoter and function as a tumor suppressor. Several pseudogenes have been identified, including one on the long arm of chromosome 1. Alpha-enolase has also been identified as an autoantigen in Hashimoto encephalopathy. | enolase 1 | NA |
| ENSG00000151552 | QDPR | 5860 | This gene encodes the enzyme dihydropteridine reductase, which catalyzes the NADH-mediated reduction of quinonoid dihydrobiopterin. This enzyme is an essential component of the pterin-dependent aromatic amino acid hydroxylating systems. Mutations in this gene resulting in QDPR deficiency include aberrant splicing, amino acid substitutions, insertions, or premature terminations. Dihydropteridine reductase deficiency presents as atypical phenylketonuria due to insufficient production of biopterin, a cofactor for phenylalanine hydroxylase. | quinoid dihydropteridine reductase | NA |
| ENSG00000068903 | SIRT2 | 22933 | This gene encodes a member of the sirtuin family of proteins, homologs to the yeast Sir2 protein. Members of the sirtuin family are characterized by a sirtuin core domain and grouped into four classes. The functions of human sirtuins have not yet been determined; however, yeast sirtuin proteins are known to regulate epigenetic gene silencing and suppress recombination of rDNA. Studies suggest that the human sirtuins may function as intracellular regulatory proteins with mono-ADP-ribosyltransferase activity. The protein encoded by this gene is included in class I of the sirtuin family. Several transcript variants result from alternative splicing of this gene. | sirtuin 2 | NA |
| ENSG00000178814 | OPLAH | 26873 | The protein encoded by this gene acts as a homodimer, using ATP hydrolysis to catalyze the conversion of 5-oxo-L-proline to L-glutamate. Defects in this gene are a cause of 5-oxoprolinase deficiency (OPLAHD). | 5-oxoprolinase (ATP-hydrolysing) | NA |
| ENSG00000171223 | JUNB | 3726 | NA | JunB proto-oncogene, AP-1 transcription factor subunit | NA |
| ENSG00000095321 | CRAT | 1384 | This gene encodes carnitine acetyltransferase (CRAT), which is a key enzyme in the metabolic pathway in mitochondria, peroxisomes and endoplasmic reticulum. CRAT catalyzes the reversible transfer of acyl groups from an acyl-CoA thioester to carnitine and regulates the ratio of acylCoA/CoA in the subcellular compartments. Two transcript variants encoding different isoforms have been found for this gene. | carnitine O-acetyltransferase | NA |
| ENSG00000105290 | APLP1 | 333 | This gene encodes a member of the highly conserved amyloid precursor protein gene family. The encoded protein is a membrane-associated glycoprotein that is cleaved by secretases in a manner similar to amyloid beta A4 precursor protein cleavage. This cleavage liberates an intracellular cytoplasmic fragment that may act as a transcriptional activator. The encoded protein may also play a role in synaptic maturation during cortical development. Alternatively spliced transcript variants encoding different isoforms have been described. | amyloid beta precursor like protein 1 | NA |
| ENSG00000155980 | KIF5A | 3798 | This gene encodes a member of the kinesin family of proteins. Members of this family are part of a multisubunit complex that functions as a microtubule motor in intracellular organelle transport. Mutations in this gene cause autosomal dominant spastic paraplegia 10. | kinesin family member 5A | NA |
| ENSG00000129538 | RNASE1 | 6035 | This gene encodes a member of the pancreatic-type of secretory ribonucleases, a subset of the ribonuclease A superfamily. The encoded endonuclease cleaves internal phosphodiester RNA bonds on the 3’-side of pyrimidine bases. It prefers poly(C) as a substrate and hydrolyzes 2’,3’-cyclic nucleotides, with a pH optimum near 8.0. The encoded protein is monomeric and more commonly acts to degrade ds-RNA over ss-RNA. Alternative splicing occurs at this locus and four transcript variants encoding the same protein have been identified. | ribonuclease A family member 1, pancreatic | NA |
| ENSG00000198125 | MB | 4151 | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | myoglobin | NA |
| ENSG00000166925 | TSC22D4 | 81628 | TSC22D4 is a member of the TSC22 domain family of leucine zipper transcriptional regulators (see TSC22D3; MIM 300506) (Kester et al., 1999 [PubMed 10488076]; Fiorenza et al., 2001 [PubMed 11707329]). | TSC22 domain family member 4 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_",7,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
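`queryMany` can return more than one row for a single query ID; in the cluster 8 results in this document, for example, ENSG00000143226 is matched to both FCGR2A and FCGR2C. Writing `out$query` directly would then repeat that ID in the saved file. Below is a minimal sketch of one way to guard against this, assuming a plain data frame with a `query` column. The helper name `write_cluster_genes` and the toy table are hypothetical and not part of the original analysis.

```r
# Hypothetical helper (not part of the original analysis): write the distinct
# query IDs of an annotation table to a one-column text file.
write_cluster_genes <- function(annot, cluster, out_dir) {
  # Keep each queried ID once, in order of first appearance.
  ids <- unique(as.character(annot$query))
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(ids, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Toy annotation table in which one Ensembl ID maps to two symbols.
toy <- data.frame(
  query  = c("ENSG00000143226", "ENSG00000143226", "ENSG00000122862"),
  symbol = c("FCGR2A", "FCGR2C", "SRGN"),
  stringsAsFactors = FALSE
)

# Write to a temporary directory and read the file back.
path <- write_cluster_genes(toy, cluster = 8, out_dir = tempdir())
written <- readLines(path)
```

In the analysis itself, the equivalent call would pass `as.data.frame(out)` together with the cluster index and the output directory already used by the surrounding code.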
out <- mygene::queryMany(gene_list[8,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | query | name | summary | notfound |
|---|---|---|---|---|---|
| S100A9 | 6280 | ENSG00000163220 | S100 calcium binding protein A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | NA |
| CSF3R | 1441 | ENSG00000119535 | colony stimulating factor 3 receptor | The protein encoded by this gene is the receptor for colony stimulating factor 3, a cytokine that controls the production, differentiation, and function of granulocytes. The encoded protein, which is a member of the family of cytokine receptors, may also function in some cell surface adhesion or recognition processes. Alternatively spliced transcript variants have been described. Mutations in this gene are a cause of Kostmann syndrome, also known as severe congenital neutropenia. | NA |
| IFITM2 | 10581 | ENSG00000185201 | interferon induced transmembrane protein 2 | NA | NA |
| SELL | 6402 | ENSG00000188404 | selectin L | This gene encodes a cell surface adhesion molecule that belongs to a family of adhesion/homing receptors. The encoded protein contains a C-type lectin-like domain, a calcium-binding epidermal growth factor-like domain, and two short complement-like repeats. The gene product is required for binding and subsequent rolling of leucocytes on endothelial cells, facilitating their migration into secondary lymphoid organs and inflammation sites. Single-nucleotide polymorphisms in this gene have been associated with various diseases including immunoglobulin A nephropathy. Alternatively spliced transcript variants have been found for this gene. | NA |
| FPR1 | 2357 | ENSG00000171051 | formyl peptide receptor 1 | This gene encodes a G protein-coupled receptor of mammalian phagocytic cells that is a member of the G-protein coupled receptor 1 family. The protein mediates the response of phagocytic cells to invasion of the host by microorganisms and is important in host defense and inflammation. | NA |
| LCP1 | 3936 | ENSG00000136167 | lymphocyte cytosolic protein 1 | Plastins are a family of actin-binding proteins that are conserved throughout eukaryote evolution and expressed in most tissues of higher eukaryotes. In humans, two ubiquitous plastin isoforms (L and T) have been identified. Plastin 1 (otherwise known as Fimbrin) is a third distinct plastin isoform which is specifically expressed at high levels in the small intestine. The L isoform is expressed only in hemopoietic cell lineages, while the T isoform has been found in all other normal cells of solid tissues that have replicative potential (fibroblasts, endothelial cells, epithelial cells, melanocytes, etc.). However, L-plastin has been found in many types of malignant human cells of non-hemopoietic origin suggesting that its expression is induced accompanying tumorigenesis in solid tissues. | NA |
| MMP25 | 64386 | ENSG00000008516 | matrix metallopeptidase 25 | Proteins of the matrix metalloproteinase (MMP) family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. Most MMPs are secreted as inactive proproteins which are activated when cleaved by extracellular proteinases. However, the protein encoded by this gene is a member of the membrane-type MMP (MT-MMP) subfamily, attached to the plasma membrane via a glycosylphosphatidyl inositol anchor. In response to bacterial infection or inflammation, the encoded protein is thought to inactivate alpha-1 proteinase inhibitor, a major tissue protectant against proteolytic enzymes released by activated neutrophils, facilitating the transendothelial migration of neutrophils to inflammatory sites. The encoded protein may also play a role in tumor invasion and metastasis through activation of MMP2. The gene has previously been referred to as MMP20 but has been renamed MMP25. | NA |
| VNN2 | 8875 | ENSG00000112303 | vanin 2 | This gene product is a member of the Vanin family of proteins that share extensive sequence similarity with each other, and also with biotinidase. The family includes secreted and membrane-associated proteins, a few of which have been reported to participate in hematopoietic cell trafficking. No biotinidase activity has been demonstrated for any of the vanin proteins, however, they possess pantetheinase activity, which may play a role in oxidative-stress response. The encoded protein is a GPI-anchored cell surface molecule that plays a role in transendothelial migration of neutrophils. This gene lies in close proximity to, and in same transcriptional orientation as two other vanin genes on chromosome 6q23-q24. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | NA |
| IL1R2 | 7850 | ENSG00000115590 | interleukin 1 receptor type 2 | The protein encoded by this gene is a cytokine receptor that belongs to the interleukin 1 receptor family. This protein binds interleukin alpha (IL1A), interleukin beta (IL1B), and interleukin 1 receptor, type I (IL1R1/IL1RA), and acts as a decoy receptor that inhibits the activity of its ligands. Interleukin 4 (IL4) is reported to antagonize the activity of interleukin 1 by inducing the expression and release of this cytokine. This gene and three other genes form a cytokine receptor gene cluster on chromosome 2q12. Alternative splicing results in multiple transcript variants and protein isoforms. Alternative splicing produces both membrane-bound and soluble proteins. A soluble protein is also produced by proteolytic cleavage. | NA |
| FCGR3B | 2215 | ENSG00000162747 | Fc fragment of IgG receptor IIIb | The protein encoded by this gene is a low affinity receptor for the Fc region of gamma immunoglobulins (IgG). The encoded protein acts as a monomer and can bind either monomeric or aggregated IgG. This gene may function to capture immune complexes in the peripheral circulation. Several transcript variants encoding different isoforms have been found for this gene. A highly-similar gene encoding a related protein is also found on chromosome 1. | NA |
| C10orf54 | 64115 | ENSG00000107738 | chromosome 10 open reading frame 54 | NA | NA |
| AQP9 | 366 | ENSG00000103569 | aquaporin 9 | The aquaporins are a family of water-selective membrane channels. This gene encodes a member of a subset of aquaporins called the aquaglyceroporins. This protein allows passage of a broad range of noncharged solutes and also stimulates urea transport and osmotic water permeability. This protein may also facilitate the uptake of glycerol in hepatic tissue. The encoded protein may also play a role in specialized leukocyte functions such as immunological response and bactericidal activity. Alternate splicing results in multiple transcript variants. | NA |
| MNDA | 4332 | ENSG00000163563 | myeloid cell nuclear differentiation antigen | The myeloid cell nuclear differentiation antigen (MNDA) is detected only in nuclei of cells of the granulocyte-monocyte lineage. A 200-amino acid region of human MNDA is strikingly similar to a region in the proteins encoded by a family of interferon-inducible mouse genes, designated Ifi-201, Ifi-202, and Ifi-203, that are not regulated in a cell- or tissue-specific fashion. The 1.8-kb MNDA mRNA, which contains an interferon-stimulated response element in the 5-prime untranslated region, was significantly upregulated in human monocytes exposed to interferon alpha. MNDA is located within 2,200 kb of FCER1A, APCS, CRP, and SPTA1. In its pattern of expression and/or regulation, MNDA resembles IFI16, suggesting that these genes participate in blood cell-specific responses to interferons. | NA |
| SERPINA1 | 5265 | ENSG00000197249 | serpin family A member 1 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | NA |
| S100A11 | 6282 | ENSG00000163191 | S100 calcium binding protein A11 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in motility, invasion, and tubulin polymerization. Chromosomal rearrangements and altered expression of this gene have been implicated in tumor metastasis. | NA |
| ALPL | 249 | ENSG00000162551 | alkaline phosphatase, liver/bone/kidney | This gene encodes a member of the alkaline phosphatase family of proteins. There are at least four distinct but related alkaline phosphatases: intestinal, placental, placental-like, and liver/bone/kidney (tissue non-specific). The first three are located together on chromosome 2, while the tissue non-specific form is located on chromosome 1. The product of this gene is a membrane bound glycosylated enzyme that is not expressed in any particular tissue and is, therefore, referred to as the tissue-nonspecific form of the enzyme. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature enzyme. This enzyme may play a role in bone mineralization. Mutations in this gene have been linked to hypophosphatasia, a disorder that is characterized by hypercalcemia and skeletal defects. | NA |
| CXCR1 | 3577 | ENSG00000163464 | C-X-C motif chemokine receptor 1 | The protein encoded by this gene is a member of the G-protein-coupled receptor family. This protein is a receptor for interleukin 8 (IL8). It binds to IL8 with high affinity, and transduces the signal through a G-protein activated second messenger system. Knockout studies in mice suggested that this protein inhibits embryonic oligodendrocyte precursor migration in developing spinal cord. This gene, IL8RB, a gene encoding another high affinity IL8 receptor, as well as IL8RBP, a pseudogene of IL8RB, form a gene cluster in a region mapped to chromosome 2q33-q36. | NA |
| MYH11 | 4629 | ENSG00000133392 | myosin, heavy chain 11, smooth muscle | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| S100A8 | 6279 | ENSG00000143546 | S100 calcium binding protein A8 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| ARHGDIB | 397 | ENSG00000111348 | Rho GDP dissociation inhibitor beta | Members of the Rho (or ARH) protein family (see MIM 165390) and other Ras-related small GTP-binding proteins (see MIM 179520) are involved in diverse cellular events, including cell signaling, proliferation, cytoskeletal organization, and secretion. The GTP-binding proteins are active only in the GTP-bound state. At least 3 classes of proteins tightly regulate cycling between the GTP-bound and GDP-bound states: GTPase-activating proteins (GAPs), guanine nucleotide-releasing factors (GRFs), and GDP-dissociation inhibitors (GDIs). The GDIs, including ARHGDIB, decrease the rate of GDP dissociation from Ras-like GTPases (summary by Scherle et al., 1993 [PubMed 8356058]). | NA |
| LAPTM5 | 7805 | ENSG00000162511 | lysosomal protein transmembrane 5 | This gene encodes a transmembrane receptor that is associated with lysosomes. The encoded protein, also known as E3 protein, may play a role in hematopoiesis. | NA |
| MYO1F | 4542 | ENSG00000142347 | myosin IF | NA | NA |
| NCF2 | 4688 | ENSG00000116701 | neutrophil cytosolic factor 2 | This gene encodes neutrophil cytosolic factor 2, the 67-kilodalton cytosolic subunit of the multi-protein NADPH oxidase complex found in neutrophils. This oxidase produces a burst of superoxide which is delivered to the lumen of the neutrophil phagosome. Mutations in this gene, as well as in other NADPH oxidase subunits, can result in chronic granulomatous disease, a disease that causes recurrent infections by catalase-positive organisms. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| FCGR2A | 2212 | ENSG00000143226 | Fc fragment of IgG receptor IIa | This gene encodes one member of a family of immunoglobulin Fc receptor genes found on the surface of many immune response cells. The protein encoded by this gene is a cell surface receptor found on phagocytic cells such as macrophages and neutrophils, and is involved in the process of phagocytosis and clearing of immune complexes. Alternative splicing results in multiple transcript variants. | NA |
| FCGR2C | 9103 | ENSG00000143226 | Fc fragment of IgG receptor IIc (gene/pseudogene) | This gene encodes one of three members of a family of low-affinity immunoglobulin gamma Fc receptors found on the surface of many immune response cells. The encoded protein is a transmembrane glycoprotein and may be involved in phagocytosis and clearing of immune complexes. An allelic polymorphism in this gene results in both coding and non-coding variants. | NA |
| SRGN | 5552 | ENSG00000122862 | serglycin | This gene encodes a protein best known as a hematopoietic cell granule proteoglycan. Proteoglycans stored in the secretory granules of many hematopoietic cells also contain a protease-resistant peptide core, which may be important for neutralizing hydrolytic enzymes. This encoded protein was found to be associated with the macromolecular complex of granzymes and perforin, which may serve as a mediator of granule-mediated apoptosis. Two transcript variants, only one of them protein-coding, have been found for this gene. | NA |
| SMAP2 | 64744 | ENSG00000084070 | small ArfGAP2 | NA | NA |
| NAMPT | 10135 | ENSG00000105835 | nicotinamide phosphoribosyltransferase | This gene encodes a protein that catalyzes the condensation of nicotinamide with 5-phosphoribosyl-1-pyrophosphate to yield nicotinamide mononucleotide, one step in the biosynthesis of nicotinamide adenine dinucleotide. The protein belongs to the nicotinic acid phosphoribosyltransferase (NAPRTase) family and is thought to be involved in many important biological processes, including metabolism, stress response and aging. This gene has a pseudogene on chromosome 10. | NA |
| SLC11A1 | 6556 | ENSG00000018280 | solute carrier family 11 member 1 | This gene is a member of the solute carrier family 11 (proton-coupled divalent metal ion transporters) family and encodes a multi-pass membrane protein. The protein functions as a divalent transition metal (iron and manganese) transporter involved in iron metabolism and host resistance to certain pathogens. Mutations in this gene have been associated with susceptibility to infectious diseases such as tuberculosis and leprosy, and inflammatory diseases such as rheumatoid arthritis and Crohn disease. Alternatively spliced variants that encode different protein isoforms have been described but the full-length nature of only one has been determined. | NA |
| FGR | 2268 | ENSG00000000938 | FGR proto-oncogene, Src family tyrosine kinase | This gene is a member of the Src family of protein tyrosine kinases (PTKs). The encoded protein contains N-terminal sites for myristylation and palmitylation, a PTK domain, and SH2 and SH3 domains which are involved in mediating protein-protein interactions with phosphotyrosine-containing and proline-rich motifs, respectively. The protein localizes to plasma membrane ruffles, and functions as a negative regulator of cell migration and adhesion triggered by the beta-2 integrin signal transduction pathway. Infection with Epstein-Barr virus results in the overexpression of this gene. Multiple alternatively spliced variants, encoding the same protein, have been identified. | NA |
| MYL9 | 10398 | ENSG00000101335 | myosin light chain 9 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| SELPLG | 6404 | ENSG00000110876 | selectin P ligand | This gene encodes a glycoprotein that functions as a high affinity counter-receptor for the cell adhesion molecules P-, E- and L- selectin expressed on myeloid cells and stimulated T lymphocytes. As such, this protein plays a critical role in leukocyte trafficking during inflammation by tethering of leukocytes to activated platelets or endothelia expressing selectins. This protein requires two post-translational modifications, tyrosine sulfation and the addition of the sialyl Lewis x tetrasaccharide (sLex) to its O-linked glycans, for its high-affinity binding activity. Aberrant expression of this gene and polymorphisms in this gene are associated with defects in the innate and adaptive immune response. Alternate splicing results in multiple transcript variants. | NA |
| MMP9 | 4318 | ENSG00000100985 | matrix metallopeptidase 9 | Proteins of the matrix metalloproteinase (MMP) family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. Most MMPs are secreted as inactive proproteins which are activated when cleaved by extracellular proteinases. The enzyme encoded by this gene degrades type IV and V collagens. Studies in rhesus monkeys suggest that the enzyme is involved in IL-8-induced mobilization of hematopoietic progenitor cells from bone marrow, and murine studies suggest a role in tumor-associated tissue remodeling. | NA |
| ALOX5AP | 241 | ENSG00000132965 | arachidonate 5-lipoxygenase activating protein | This gene encodes a protein which, with 5-lipoxygenase, is required for leukotriene synthesis. Leukotrienes are arachidonic acid metabolites which have been implicated in various types of inflammatory responses, including asthma, arthritis and psoriasis. This protein localizes to the plasma membrane. Inhibitors of its function impede translocation of 5-lipoxygenase from the cytoplasm to the cell membrane and inhibit 5-lipoxygenase activation. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | NA |
| HCK | 3055 | ENSG00000101336 | HCK proto-oncogene, Src family tyrosine kinase | The protein encoded by this gene is a member of the Src family of tyrosine kinases. This protein is primarily hemopoietic, particularly in cells of the myeloid and B-lymphoid lineages. It may help couple the Fc receptor to the activation of the respiratory burst. In addition, it may play a role in neutrophil migration and in the degranulation of neutrophils. Multiple isoforms with different subcellular distributions are produced due to both alternative splicing and the use of alternative translation initiation codons, including a non-AUG (CUG) codon. | NA |
| SPI1 | 6688 | ENSG00000066336 | Spi-1 proto-oncogene | This gene encodes an ETS-domain transcription factor that activates gene expression during myeloid and B-lymphoid cell development. The nuclear protein binds to a purine-rich sequence known as the PU-box found near the promoters of target genes, and regulates their expression in coordination with other transcription factors and cofactors. The protein can also regulate alternative splicing of target genes. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| TIMP3 | 7078 | ENSG00000100234 | TIMP metallopeptidase inhibitor 3 | This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix (ECM). Expression of this gene is induced in response to mitogenic stimulation and this netrin domain-containing protein is localized to the ECM. Mutations in this gene have been associated with the autosomal dominant disorder Sorsby’s fundus dystrophy. | NA |
| ITGB2 | 3689 | ENSG00000160255 | integrin subunit beta 2 | This gene encodes an integrin beta chain, which combines with multiple different alpha chains to form different integrin heterodimers. Integrins are integral cell-surface proteins that participate in cell adhesion as well as cell-surface mediated signalling. The encoded protein plays an important role in immune response and defects in this gene cause leukocyte adhesion deficiency. Alternative splicing results in multiple transcript variants. | NA |
| S100A12 | 6283 | ENSG00000163221 | S100 calcium binding protein A12 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein is proposed to be involved in specific calcium-dependent signal transduction pathways and its regulatory effect on cytoskeletal components may modulate various neutrophil activities. The protein includes an antimicrobial peptide which has antibacterial activity. | NA |
| FLOT2 | 2319 | ENSG00000132589 | flotillin 2 | Caveolae are small domains on the inner cell membrane involved in vesicular trafficking and signal transduction. This gene encodes a caveolae-associated, integral membrane protein, which is thought to function in neuronal signaling. | NA |
| XPO6 | 23214 | ENSG00000169180 | exportin 6 | The protein encoded by this gene is a member of the importin-beta family. Members of this family are regulated by the GTPase Ran to mediate transport of cargo across the nuclear envelope. This protein has been shown to mediate nuclear export of profilin-actin complexes. A pseudogene of this gene is located on the long arm of chromosome 14. Alternative splicing results in multiple transcript variants that encode different protein isoforms. | NA |
| LYZ | 4069 | ENSG00000090382 | lysozyme | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | NA |
| FCER1G | 2207 | ENSG00000158869 | Fc fragment of IgE receptor Ig | The high affinity IgE receptor is a key molecule involved in allergic reactions. It is a tetramer composed of 1 alpha, 1 beta, and 2 gamma chains. The gamma chains are also subunits of other Fc receptors. | NA |
| PGD | 5226 | ENSG00000142657 | phosphogluconate dehydrogenase | 6-phosphogluconate dehydrogenase is the second dehydrogenase in the pentose phosphate shunt. Deficiency of this enzyme is generally asymptomatic, and the inheritance of this disorder is autosomal dominant. Hemolysis results from combined deficiency of 6-phosphogluconate dehydrogenase and 6-phosphogluconolactonase suggesting a synergism of the two enzymopathies. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| HLA-C | 3107 | ENSG00000204525 | major histocompatibility complex, class I, C | HLA-C belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Over one hundred HLA-C alleles have been described. | NA |
| LILRA5 | 353514 | ENSG00000187116 | leukocyte immunoglobulin like receptor A5 | The protein encoded by this gene is a member of the leukocyte immunoglobulin-like receptor (LIR) family. LIR family members are known to have activating and inhibitory functions in leukocytes. Crosslink of this receptor protein on the surface of monocytes has been shown to induce calcium flux and secretion of several proinflammatory cytokines, which suggests the roles of this protein in triggering innate immune responses. This gene is one of the leukocyte receptor genes that form a gene cluster on the chromosomal region 19q13.4. Four alternatively spliced transcript variants encoding distinct isoforms have been described. | NA |
| ITGAX | 3687 | ENSG00000140678 | integrin subunit alpha X | This gene encodes the integrin alpha X chain protein. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain. This protein combines with the beta 2 chain (ITGB2) to form a leukocyte-specific integrin referred to as inactivated-C3b (iC3b) receptor 4 (CR4). The alpha X beta 2 complex seems to overlap the properties of the alpha M beta 2 integrin in the adherence of neutrophils and monocytes to stimulated endothelium cells, and in the phagocytosis of complement coated particles. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| CALD1 | 800 | ENSG00000122786 | caldesmon 1 | This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | NA |
| HK3 | 3101 | ENSG00000160883 | hexokinase 3 | Hexokinases phosphorylate glucose to produce glucose-6-phosphate, the first step in most glucose metabolism pathways. This gene encodes hexokinase 3. Similar to hexokinases 1 and 2, this allosteric enzyme is inhibited by its product glucose-6-phosphate. | NA |
| HCLS1 | 3059 | ENSG00000180353 | hematopoietic cell-specific Lyn substrate 1 | NA | NA |
| CORO1A | 11151 | ENSG00000102879 | coronin 1A | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. Alternative splicing results in multiple transcript variants. A related pseudogene has been defined on chromosome 16. | NA |
| HBB | 3043 | ENSG00000244734 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | NA |
| CD53 | 963 | ENSG00000143119 | CD53 molecule | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein that is known to complex with integrins. It contributes to the transduction of CD2-generated signals in T cells and natural killer cells and has been suggested to play a role in growth regulation. Familial deficiency of this gene has been linked to an immunodeficiency associated with recurrent infectious diseases caused by bacteria, fungi and viruses. Alternative splicing results in multiple transcript variants. | NA |
| CST7 | 8530 | ENSG00000077984 | cystatin F | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and the kininogens. The type 2 cystatin proteins are a class of cysteine proteinase inhibitors found in a variety of human fluids and secretions. This gene encodes a glycosylated cysteine protease inhibitor with a putative role in immune regulation through inhibition of a unique target in the hematopoietic system. Expression of the protein has been observed in various human cancer cell lines established from malignant tumors. | NA |
| GPX3 | 2878 | ENSG00000211445 | glutathione peroxidase 3 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | NA |
| SORL1 | 6653 | ENSG00000137642 | sortilin-related receptor, L(DLR class) A repeats containing | This gene encodes a mosaic protein that belongs to at least two families: the vacuolar protein sorting 10 (VPS10) domain-containing receptor family, and the low density lipoprotein receptor (LDLR) family. The encoded protein also contains fibronectin type III repeats and an epidermal growth factor repeat. The encoded preproprotein is proteolytically processed to generate the mature receptor, which likely plays roles in endocytosis and sorting. Mutations in this gene may be associated with Alzheimer’s disease. | NA |
| NA | NA | ENSG00000259716 | NA | NA | TRUE |
| DCN | 1634 | ENSG00000011465 | decorin | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | NA |
| ARRB2 | 409 | ENSG00000141480 | arrestin beta 2 | Members of arrestin/beta-arrestin protein family are thought to participate in agonist-mediated desensitization of G-protein-coupled receptors and cause specific dampening of cellular responses to stimuli such as hormones, neurotransmitters, or sensory signals. Arrestin beta 2, like arrestin beta 1, was shown to inhibit beta-adrenergic receptor function in vitro. It is expressed at high levels in the central nervous system and may play a role in the regulation of synaptic receptors. Besides the brain, a cDNA for arrestin beta 2 was isolated from thyroid gland, and thus it may also be involved in hormone-specific desensitization of TSH receptors. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| GCA | 25801 | ENSG00000115271 | grancalcin | This gene product, grancalcin, is a calcium-binding protein abundant in neutrophils and macrophages. It belongs to the penta-EF-hand subfamily of proteins which includes sorcin, calpain, and ALG-2. Grancalcin localization is dependent upon calcium and magnesium. In the absence of divalent cation, grancalcin localizes to the cytosolic fraction; with magnesium alone, it partitions with the granule fraction; and in the presence of magnesium and calcium, it associates with both the granule and membrane fractions, suggesting a role for grancalcin in granule-membrane fusion and degranulation. | NA |
| DOK3 | 79930 | ENSG00000146094 | docking protein 3 | NA | NA |
| NCF4 | 4689 | ENSG00000100365 | neutrophil cytosolic factor 4 | The protein encoded by this gene is a cytosolic regulatory component of the superoxide-producing phagocyte NADPH-oxidase, a multicomponent enzyme system important for host defense. This protein is preferentially expressed in cells of myeloid lineage. It interacts primarily with neutrophil cytosolic factor 2 (NCF2/p67-phox) to form a complex with neutrophil cytosolic factor 1 (NCF1/p47-phox), which further interacts with the small G protein RAC1 and translocates to the membrane upon cell stimulation. This complex then activates flavocytochrome b, the membrane-integrated catalytic core of the enzyme system. The PX domain of this protein can bind phospholipid products of the PI(3) kinase, which suggests its role in PI(3) kinase-mediated signaling events. The phosphorylation of this protein was found to negatively regulate the enzyme activity. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | NA |
| TNS1 | 7145 | ENSG00000079308 | tensin 1 | The protein encoded by this gene localizes to focal adhesions, regions of the plasma membrane where the cell attaches to the extracellular matrix. This protein crosslinks actin filaments and contains a Src homology 2 (SH2) domain, which is often found in molecules involved in signal transduction. This protein is a substrate of calpain II. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| ICAM3 | 3385 | ENSG00000076662 | intercellular adhesion molecule 3 | The protein encoded by this gene is a member of the intercellular adhesion molecule (ICAM) family. All ICAM proteins are type I transmembrane glycoproteins, contain 2-9 immunoglobulin-like C2-type domains, and bind to the leukocyte adhesion LFA-1 protein. This protein is constitutively and abundantly expressed by all leucocytes and may be the most important ligand for LFA-1 in the initiation of the immune response. It functions not only as an adhesion molecule, but also as a potent signalling molecule. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| PYGL | 5836 | ENSG00000100504 | phosphorylase, glycogen, liver | This gene encodes a homodimeric protein that catalyses the cleavage of alpha-1,4-glucosidic bonds to release glucose-1-phosphate from liver glycogen stores. This protein switches from inactive phosphorylase B to active phosphorylase A by phosphorylation of serine residue 15. Activity of this enzyme is further regulated by multiple allosteric effectors and hormonal controls. Humans have three glycogen phosphorylase genes that encode distinct isozymes that are primarily expressed in liver, brain and muscle, respectively. The liver isozyme serves the glycemic demands of the body in general while the brain and muscle isozymes supply just those tissues. In glycogen storage disease type VI, also known as Hers disease, mutations in liver glycogen phosphorylase inhibit the conversion of glycogen to glucose and result in moderate hypoglycemia, mild ketosis, growth retardation and hepatomegaly. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| HLA-B | 3106 | ENSG00000234745 | major histocompatibility complex, class I, B | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | NA |
| RAC2 | 5880 | ENSG00000128340 | ras-related C3 botulinum toxin substrate 2 (rho family, small GTP binding protein Rac2) | This gene encodes a member of the Ras superfamily of small guanosine triphosphate (GTP)-metabolizing proteins. The encoded protein localizes to the plasma membrane, where it regulates diverse processes, such as secretion, phagocytosis, and cell polarization. Activity of this protein is also involved in the generation of reactive oxygen species. Mutations in this gene are associated with neutrophil immunodeficiency syndrome. There is a pseudogene for this gene on chromosome 6. | NA |
| TYROBP | 7305 | ENSG00000011600 | TYRO protein tyrosine kinase binding protein | This gene encodes a transmembrane signaling polypeptide which contains an immunoreceptor tyrosine-based activation motif (ITAM) in its cytoplasmic domain. The encoded protein may associate with the killer-cell inhibitory receptor (KIR) family of membrane glycoproteins and may act as an activating signal transduction element. This protein may bind zeta-chain (TCR) associated protein kinase 70kDa (ZAP-70) and spleen tyrosine kinase (SYK) and play a role in signal transduction, bone modeling, brain myelination, and inflammation. Mutations within this gene have been associated with polycystic lipomembranous osteodysplasia with sclerosing leukoencephalopathy (PLOSL), also known as Nasu-Hakola disease. Its putative receptor, triggering receptor expressed on myeloid cells 2 (TREM2), also causes PLOSL. Multiple alternative transcript variants encoding distinct isoforms have been identified for this gene. | NA |
| ABTB1 | 80325 | ENSG00000114626 | ankyrin repeat and BTB domain containing 1 | This gene encodes a protein with an ankyrin repeat region and two BTB/POZ domains, which are thought to be involved in protein-protein interactions. Expression of this gene is activated by the phosphatase and tensin homolog, a tumor suppressor. Alternate splicing results in three transcript variants. | NA |
| HLA-E | 3133 | ENSG00000204592 | major histocompatibility complex, class I, E | HLA-E belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-E binds a restricted subset of peptides derived from the leader peptides of other class I molecules. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. | NA |
| ACSL1 | 2180 | ENSG00000151726 | acyl-CoA synthetase long-chain family member 1 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| MSRB1 | 51734 | ENSG00000198736 | methionine sulfoxide reductase B1 | This gene encodes a selenoprotein, which contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon that normally signals translation termination. The 3’ UTR of selenoprotein genes have a common stem-loop structure, the sec insertion sequence (SECIS), that is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. This protein belongs to the methionine sulfoxide reductase (Msr) protein family which includes repair enzymes that reduce oxidized methionine residues in proteins. The protein encoded by this gene is expressed in a variety of adult and fetal tissues and localizes to the cell nucleus and cytosol. | NA |
| GSN | 2934 | ENSG00000148180 | gelsolin | The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | NA |
| GPSM3 | 63940 | ENSG00000213654 | G-protein signaling modulator 3 | NA | NA |
| FAM65B | 9750 | ENSG00000111913 | family with sequence similarity 65 member B | The protein encoded by this gene stimulates the formation of a non-mitotic multinucleate syncytium from proliferative cytotrophoblasts during trophoblast differentiation. Alternative splicing of this gene results in multiple transcript variants. | NA |
| SERPINB1 | 1992 | ENSG00000021355 | serpin family B member 1 | The protein encoded by this gene is a member of the serpin family of proteinase inhibitors. Members of this family maintain homeostasis by neutralizing overexpressed proteinase activity through their function as suicide substrates. This protein inhibits the neutrophil-derived proteinases neutrophil elastase, cathepsin G, and proteinase-3 and thus protects tissues from damage at inflammatory sites. Alternative splicing results in multiple transcript variants. | NA |
| COL6A2 | 1292 | ENSG00000142173 | collagen type VI alpha 2 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | NA |
| CD177 | 57126 | ENSG00000204936 | CD177 molecule | This gene encodes a glycosyl-phosphatidylinositol (GPI)-linked cell surface glycoprotein that plays a role in neutrophil activation. The protein can bind platelet endothelial cell adhesion molecule-1 and function in neutrophil transmigration. Mutations in this gene are associated with myeloproliferative diseases. Over-expression of this gene has been found in patients with polycythemia rubra vera. Autoantibodies against the protein may result in pulmonary transfusion reactions, and it may be involved in Wegener’s granulomatosis. A related pseudogene, which is adjacent to this gene on chromosome 19, has been identified. | NA |
| PTRF | 284119 | ENSG00000177469 | polymerase I and transcript release factor | This gene encodes a protein that enables the dissociation of paused ternary polymerase I transcription complexes from the 3’ end of pre-rRNA transcripts. This protein regulates rRNA transcription by promoting the dissociation of transcription complexes and the reinitiation of polymerase I on nascent rRNA transcripts. This protein also localizes to caveolae at the plasma membrane and is thought to play a critical role in the formation of caveolae and the stabilization of caveolins. This protein translocates from caveolae to the cytoplasm after insulin stimulation. Caveolae contain truncated forms of this protein and may be the site of phosphorylation-dependent proteolysis. This protein is also thought to modify lipid metabolism and insulin-regulated gene expression. Mutations in this gene result in a disorder characterized by generalized lipodystrophy and muscular dystrophy. | NA |
| COTL1 | 23406 | ENSG00000103187 | coactosin like F-actin binding protein 1 | This gene encodes one of the numerous actin-binding proteins which regulate the actin cytoskeleton. This protein binds F-actin, and also interacts with 5-lipoxygenase, which is the first committed enzyme in leukotriene biosynthesis. Although this gene has been reported to map to chromosome 17 in the Smith-Magenis syndrome region, the best alignments for this gene are to chromosome 16. The Smith-Magenis syndrome region is the site of two related pseudogenes. | NA |
| TNFRSF10C | 8794 | ENSG00000173535 | tumor necrosis factor receptor superfamily member 10c | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor contains an extracellular TRAIL-binding domain and a transmembrane domain, but no cytoplasmic death domain. This receptor is not capable of inducing apoptosis, and is thought to function as an antagonistic receptor that protects cells from TRAIL-induced apoptosis. This gene was found to be a p53-regulated DNA damage-inducible gene. The expression of this gene was detected in many normal tissues but not in most cancer cell lines, which may explain the specific sensitivity of cancer cells to the apoptosis-inducing activity of TRAIL. | NA |
| LITAF | 9516 | ENSG00000189067 | lipopolysaccharide induced TNF factor | Lipopolysaccharide is a potent stimulator of monocytes and macrophages, causing secretion of tumor necrosis factor-alpha (TNF-alpha) and other inflammatory mediators. This gene encodes lipopolysaccharide-induced TNF-alpha factor, which is a DNA-binding protein and can mediate the TNF-alpha expression by direct binding to the promoter region of the TNF-alpha gene. The transcription of this gene is induced by tumor suppressor p53 and has been implicated in the p53-induced apoptotic pathway. Mutations in this gene cause Charcot-Marie-Tooth disease type 1C (CMT1C) and may be involved in the carcinogenesis of extramammary Paget’s disease (EMPD). Multiple alternatively spliced transcript variants have been found for this gene. | NA |
| TLR2 | 7097 | ENSG00000137462 | toll like receptor 2 | The protein encoded by this gene is a member of the Toll-like receptor (TLR) family which plays a fundamental role in pathogen recognition and activation of innate immunity. TLRs are highly conserved from Drosophila to humans and share structural and functional similarities. This protein is a cell-surface protein that can form heterodimers with other TLR family members to recognize conserved molecules derived from microorganisms known as pathogen-associated molecular patterns (PAMPs). Activation of TLRs by PAMPs leads to an up-regulation of signaling pathways to modulate the host’s inflammatory response. This gene is also thought to promote apoptosis in response to bacterial lipoproteins. This gene has been implicated in the pathogenesis of several autoimmune diseases. Alternative splicing results in multiple transcript variants. | NA |
| RHOG | 391 | ENSG00000177105 | ras homolog family member G | This gene encodes a member of the Rho family of small GTPases, which cycle between inactive GDP-bound and active GTP-bound states and function as molecular switches in signal transduction cascades. Rho proteins promote reorganization of the actin cytoskeleton and regulate cell shape, attachment, and motility. The encoded protein facilitates translocation of a functional guanine nucleotide exchange factor (GEF) complex from the cytoplasm to the plasma membrane where ras-related C3 botulinum toxin substrate 1 is activated to promote lamellipodium formation and cell migration. Two related pseudogenes have been identified on chromosomes 20 and X. | NA |
| IL18RAP | 8807 | ENSG00000115607 | interleukin 18 receptor accessory protein | The protein encoded by this gene is an accessory subunit of the heterodimeric receptor for interleukin 18 (IL18), a proinflammatory cytokine involved in inducing cell-mediated immunity. This protein enhances the IL18-binding activity of the IL18 receptor and plays a role in signaling by IL18. Mutations in this gene are associated with Crohn’s disease and inflammatory bowel disease, and susceptibility to celiac disease and leprosy. Alternatively spliced transcript variants of this gene have been described, but their full-length nature is not known. | NA |
| HSPG2 | 3339 | ENSG00000142798 | heparan sulfate proteoglycan 2 | This gene encodes the perlecan protein, which consists of a core protein to which three long chains of glycosaminoglycans (heparan sulfate or chondroitin sulfate) are attached. The perlecan protein is a large multidomain proteoglycan that binds to and cross-links many extracellular matrix components and cell-surface molecules. It has been shown that this protein interacts with laminin, prolargin, collagen type IV, FGFBP1, FBLN2, FGF7 and transthyretin, etc., and it plays essential roles in multiple biological activities. Perlecan is a key component of the vascular extracellular matrix, where it helps to maintain the endothelial barrier function. It is a potent inhibitor of smooth muscle cell proliferation and is thus thought to help maintain vascular homeostasis. It can also promote growth factor (e.g., FGF2) activity and thus stimulate endothelial growth and re-generation. It is a major component of basement membranes, where it is involved in the stabilization of other molecules as well as being involved with glomerular permeability to macromolecules and cell adhesion. Mutations in this gene cause Schwartz-Jampel syndrome type 1, Silverman-Handmaker type of dyssegmental dysplasia, and tardive dyskinesia. Alternative splicing of this gene results in multiple transcript variants. | NA |
| AHNAK | 79026 | ENSG00000124942 | AHNAK nucleoprotein | NA | NA |
| BASP1 | 10409 | ENSG00000176788 | brain abundant membrane attached signal protein 1 | This gene encodes a membrane bound protein with several transient phosphorylation sites and PEST motifs. Conservation of proteins with PEST sequences among different species supports their functional significance. PEST sequences typically occur in proteins with high turnover rates. Immunological characteristics of this protein are species specific. This protein also undergoes N-terminal myristoylation. Alternative splicing results in multiple transcript variants that encode the same protein. | NA |
| PLBD1 | 79887 | ENSG00000121316 | phospholipase B domain containing 1 | NA | NA |
| NBEAL2 | 23218 | ENSG00000160796 | neurobeachin like 2 | The protein encoded by this gene contains a beige and Chediak-Higashi (BEACH) domain and multiple WD40 domains, and may play a role in megakaryocyte alpha-granule biogenesis. Mutations in this gene are a cause of gray platelet syndrome. | NA |
| RASSF2 | 9770 | ENSG00000101265 | Ras association domain family member 2 | This gene encodes a protein that contains a Ras association domain. Similar to its cattle and sheep counterparts, this gene is located near the prion gene. Two alternatively spliced transcripts encoding the same isoform have been reported. | NA |
| MMP25-AS1 | ENSG00000261971 | ENSG00000261971 | MMP25 antisense RNA 1 | NA | NA |
| LRG1 | 116844 | ENSG00000171236 | leucine rich alpha-2-glycoprotein 1 | The leucine-rich repeat (LRR) family of proteins, including LRG1, have been shown to be involved in protein-protein interaction, signal transduction, and cell adhesion and development. LRG1 is expressed during granulocyte differentiation (O’Donnell et al., 2002 [PubMed 12223515]). | NA |
| SLA | 6503 | ENSG00000155926 | Src-like-adaptor | NA | NA |
| SHKBP1 | 92799 | ENSG00000160410 | SH3KBP1 binding protein 1 | NA | NA |
| SERPING1 | 710 | ENSG00000149131 | serpin family G member 1 | This gene encodes a highly glycosylated plasma protein involved in the regulation of the complement cascade. Its protein inhibits activated C1r and C1s of the first complement component and thus regulates complement activation. Deficiency of this protein is associated with hereditary angioneurotic oedema (HANE). Alternative splicing results in multiple transcript variants encoding the same isoform. | NA |
| MYLK | 4638 | ENSG00000065534 | myosin light chain kinase | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | NA |
| ITM2B | 9445 | ENSG00000136156 | integral membrane protein 2B | Amyloid precursor proteins are processed by beta-secretase and gamma-secretase to produce beta-amyloid peptides which form the characteristic plaques of Alzheimer disease. This gene encodes a transmembrane protein which is processed at the C-terminus by furin or furin-like proteases to produce a small secreted peptide which inhibits the deposition of beta-amyloid. Mutations which result in extension of the C-terminal end of the encoded protein, thereby increasing the size of the secreted peptide, are associated with two neurodegenerative diseases, familial British dementia and familial Danish dementia. | NA |
| RHOB | 388 | ENSG00000143878 | ras homolog family member B | NA | NA |
| CAP1 | 10487 | ENSG00000131236 | CAP, adenylate cyclase-associated protein 1 (yeast) | The protein encoded by this gene is related to the S. cerevisiae CAP protein, which is involved in the cyclic AMP pathway. The human protein is able to interact with other molecules of the same protein, as well as with CAP2 and actin. Alternatively spliced transcript variants have been identified. | NA |
| TALDO1 | 6888 | ENSG00000177156 | transaldolase 1 | Transaldolase 1 is a key enzyme of the nonoxidative pentose phosphate pathway providing ribose-5-phosphate for nucleic acid synthesis and NADPH for lipid biosynthesis. This pathway can also maintain glutathione at a reduced state and thus protect sulfhydryl groups and cellular integrity from oxygen radicals. The functional gene of transaldolase 1 is located on chromosome 11 and a pseudogene is identified on chromosome 1 but there are conflicting map locations. The second and third exon of this gene were developed by insertion of a retrotransposable element. This gene is thought to be involved in multiple sclerosis. | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 8, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[9,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | X_id | name | summary | symbol |
|---|---|---|---|---|
| ENSG00000175084 | 1674 | desmin | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | DES |
| ENSG00000019582 | 972 | CD74 molecule | The protein encoded by this gene associates with class II major histocompatibility complex (MHC) and is an important chaperone that regulates antigen presentation for immune response. It also serves as cell surface receptor for the cytokine macrophage migration inhibitory factor (MIF) which, when bound to the encoded protein, initiates survival pathways and cell proliferation. This protein also interacts with amyloid precursor protein (APP) and suppresses the production of amyloid beta (Abeta). Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | CD74 |
| ENSG00000197971 | 4155 | myelin basic protein | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | MBP |
| ENSG00000166710 | 567 | beta-2-microglobulin | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | B2M |
| ENSG00000244734 | 3043 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | HBB |
| ENSG00000204287 | 3122 | major histocompatibility complex, class II, DR alpha | HLA-DRA is one of the HLA class II alpha chain paralogues. This class II molecule is a heterodimer consisting of an alpha and a beta chain, both anchored in the membrane. It plays a central role in the immune system by presenting peptides derived from extracellular proteins. Class II molecules are expressed in antigen presenting cells (APC: B lymphocytes, dendritic cells, macrophages). The alpha chain is approximately 33-35 kDa and its gene contains 5 exons. Exon 1 encodes the leader peptide, exons 2 and 3 encode the two extracellular domains, and exon 4 encodes the transmembrane domain and the cytoplasmic tail. DRA does not have polymorphisms in the peptide binding part and acts as the sole alpha chain for DRB1, DRB3, DRB4 and DRB5. | HLA-DRA |
| ENSG00000198467 | 7169 | tropomyosin 2 (beta) | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | TPM2 |
| ENSG00000204592 | 3133 | major histocompatibility complex, class I, E | HLA-E belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-E binds a restricted subset of peptides derived from the leader peptides of other class I molecules. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. | HLA-E |
| ENSG00000128591 | 2318 | filamin C | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | FLNC |
| ENSG00000184009 | 71 | actin gamma 1 | Actins are highly conserved proteins that are involved in various types of cell motility, and maintenance of the cytoskeleton. In vertebrates, three main groups of actin isoforms, alpha, beta and gamma have been identified. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton, and as mediators of internal cell motility. Actin, gamma 1, encoded by this gene, is a cytoplasmic actin found in non-muscle cells. Mutations in this gene are associated with DFNA20/26, a subtype of autosomal dominant non-syndromic sensorineural progressive hearing loss. Alternative splicing results in multiple transcript variants. | ACTG1 |
| ENSG00000188536 | 3040 | hemoglobin subunit alpha 2 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | HBA2 |
| ENSG00000204983 | 5644 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | PRSS1 |
| ENSG00000091704 | 1357 | carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | CPA1 |
| ENSG00000204525 | 3107 | major histocompatibility complex, class I, C | HLA-C belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Over one hundred HLA-C alleles have been described. | HLA-C |
| ENSG00000234745 | 3106 | major histocompatibility complex, class I, B | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | HLA-B |
| ENSG00000172403 | 171024 | synaptopodin 2 | NA | SYNPO2 |
| ENSG00000231389 | 3113 | major histocompatibility complex, class II, DP alpha 1 | HLA-DPA1 belongs to the HLA class II alpha chain paralogues. This class II molecule is a heterodimer consisting of an alpha (DPA) and a beta (DPB) chain, both anchored in the membrane. It plays a central role in the immune system by presenting peptides derived from extracellular proteins. Class II molecules are expressed in antigen presenting cells (APC: B lymphocytes, dendritic cells, macrophages). The alpha chain is approximately 33-35 kDa and its gene contains 5 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the two extracellular domains, exon 4 encodes the transmembrane domain and the cytoplasmic tail. Within the DP molecule both the alpha chain and the beta chain contain the polymorphisms specifying the peptide binding specificities, resulting in up to 4 different molecules. | HLA-DPA1 |
| ENSG00000169347 | 2813 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | GP2 |
| ENSG00000100316 | 6122 | ribosomal protein L3 | Ribosomes, the complexes that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L3P family of ribosomal proteins and it is located in the cytoplasm. The protein can bind to the HIV-1 TAR mRNA, and it has been suggested that the protein contributes to tat-mediated transactivation. This gene is co-transcribed with several small nucleolar RNA genes, which are located in several of this gene’s introns. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPL3 |
| ENSG00000137154 | 6194 | ribosomal protein S6 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a cytoplasmic ribosomal protein that is a component of the 40S subunit. The protein belongs to the S6E family of ribosomal proteins. It is the major substrate of protein kinases in the ribosome, with subsets of five C-terminal serine residues phosphorylated by different protein kinases. Phosphorylation is induced by a wide range of stimuli, including growth factors, tumor-promoting agents, and mitogens. Dephosphorylation occurs at growth arrest. The protein may contribute to the control of cell growth and proliferation through the selective translation of particular classes of mRNA. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPS6 |
| ENSG00000155657 | 7273 | titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | TTN |
| ENSG00000175535 | 5406 | pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | PNLIP |
| ENSG00000140416 | 7168 | tropomyosin 1 (alpha) | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | TPM1 |
| ENSG00000166165 | 1152 | creatine kinase B | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | CKB |
| ENSG00000142789 | 10136 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | CELA3A |
| ENSG00000170835 | 1056 | carboxyl ester lipase | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | CEL |
| ENSG00000122862 | 5552 | serglycin | This gene encodes a protein best known as a hematopoietic cell granule proteoglycan. Proteoglycans stored in the secretory granules of many hematopoietic cells also contain a protease-resistant peptide core, which may be important for neutralizing hydrolytic enzymes. This encoded protein was found to be associated with the macromolecular complex of granzymes and perforin, which may serve as a mediator of granule-mediated apoptosis. Two transcript variants, only one of them protein-coding, have been found for this gene. | SRGN |
| ENSG00000153002 | 1360 | carboxypeptidase B1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | CPB1 |
| ENSG00000196126 | 3123 | major histocompatibility complex, class II, DR beta 1 | HLA-DRB1 belongs to the HLA class II beta chain paralogs. The class II molecule is a heterodimer consisting of an alpha (DRA) and a beta chain (DRB), both anchored in the membrane. It plays a central role in the immune system by presenting peptides derived from extracellular proteins. Class II molecules are expressed in antigen presenting cells (APC: B lymphocytes, dendritic cells, macrophages). The beta chain is approximately 26-28 kDa. It is encoded by 6 exons. Exon one encodes the leader peptide; exons 2 and 3 encode the two extracellular domains; exon 4 encodes the transmembrane domain; and exon 5 encodes the cytoplasmic tail. Within the DR molecule the beta chain contains all the polymorphisms specifying the peptide binding specificities. Hundreds of DRB1 alleles have been described and typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. DRB1 is expressed at a level five times higher than its paralogs DRB3, DRB4 and DRB5. DRB1 is present in all individuals. Allelic variants of DRB1 are linked with either none or one of the genes DRB3, DRB4 and DRB5. There are 4 related pseudogenes: DRB2, DRB6, DRB7, DRB8 and DRB9. | HLA-DRB1 |
| ENSG00000196126 | 105369230 | HLA class II histocompatibility antigen, DRB1-7 beta chain | NA | LOC105369230 |
| ENSG00000074800 | 2023 | enolase 1 | This gene encodes alpha-enolase, one of three enolase isoenzymes found in mammals. Each isoenzyme is a homodimer composed of 2 alpha, 2 gamma, or 2 beta subunits, and functions as a glycolytic enzyme. Alpha-enolase in addition, functions as a structural lens protein (tau-crystallin) in the monomeric form. Alternative splicing of this gene results in a shorter isoform that has been shown to bind to the c-myc promoter and function as a tumor suppressor. Several pseudogenes have been identified, including one on the long arm of chromosome 1. Alpha-enolase has also been identified as an autoantigen in Hashimoto encephalopathy. | ENO1 |
| ENSG00000162511 | 7805 | lysosomal protein transmembrane 5 | This gene encodes a transmembrane receptor that is associated with lysosomes. The encoded protein, also known as E3 protein, may play a role in hematopoiesis. | LAPTM5 |
| ENSG00000266844 | ENSG00000266844 | NA | NA | RP11-862L9.3 |
| ENSG00000156508 | 1915 | eukaryotic translation elongation factor 1 alpha 1 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart and skeletal muscle. This isoform is identified as an autoantigen in 66% of patients with Felty syndrome. This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes. | EEF1A1 |
| ENSG00000231500 | 6222 | ribosomal protein S18 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S13P family of ribosomal proteins. It is located in the cytoplasm. The gene product of the E. coli ortholog (ribosomal protein S13) is involved in the binding of fMet-tRNA, and thus, in the initiation of translation. This gene is an ortholog of mouse Ke3. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPS18 |
| ENSG00000120885 | 1191 | clusterin | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | CLU |
| ENSG00000206503 | 3105 | major histocompatibility complex, class I, A | HLA-A belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-A alleles have been described. | HLA-A |
| ENSG00000180353 | 3059 | hematopoietic cell-specific Lyn substrate 1 | NA | HCLS1 |
| ENSG00000101335 | 10398 | myosin light chain 9 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | MYL9 |
| ENSG00000059804 | 6515 | solute carrier family 2 member 3 | NA | SLC2A3 |
| ENSG00000185303 | 729238 | surfactant protein A2 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | SFTPA2 |
| ENSG00000143119 | 963 | CD53 molecule | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein that is known to complex with integrins. It contributes to the transduction of CD2-generated signals in T cells and natural killer cells and has been suggested to play a role in growth regulation. Familial deficiency of this gene has been linked to an immunodeficiency associated with recurrent infectious diseases caused by bacteria, fungi and viruses. Alternative splicing results in multiple transcript variants. | CD53 |
| ENSG00000115386 | 5967 | regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1A |
| ENSG00000197321 | 6840 | supervillin | This gene encodes a bipartite protein with distinct amino- and carboxy-terminal domains. The amino-terminus contains nuclear localization signals and the carboxy-terminus contains numerous consecutive sequences with extensive similarity to proteins in the gelsolin family of actin-binding proteins, which cap, nucleate, and/or sever actin filaments. The gene product is tightly associated with both actin filaments and plasma membranes, suggesting a role as a high-affinity link between the actin cytoskeleton and the membrane. The encoded protein appears to aid in both myosin II assembly during cell spreading and disassembly of focal adhesions. Several transcript variants encoding different isoforms of supervillin have been described. | SVIL |
| ENSG00000165795 | 57447 | NDRG family member 2 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | NDRG2 |
| ENSG00000110719 | 10312 | T-cell immune regulator 1, ATPase H+ transporting V0 subunit a3 | Through alternate splicing, this gene encodes two proteins with similarity to subunits of the vacuolar ATPase (V-ATPase) but the encoded proteins seem to have different functions. V-ATPase is a multisubunit enzyme that mediates acidification of eukaryotic intracellular organelles. V-ATPase dependent organelle acidification is necessary for such intracellular processes as protein sorting, zymogen activation, and receptor-mediated endocytosis. V-ATPase is comprised of a cytosolic V1 domain and a transmembrane V0 domain. Mutations in this gene are associated with infantile malignant osteopetrosis. | TCIRG1 |
| ENSG00000075624 | 60 | actin, beta | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ACTB |
| ENSG00000113140 | 6678 | secreted protein acidic and cysteine rich | This gene encodes a cysteine-rich acidic matrix-associated protein. The encoded protein is required for the collagen in bone to become calcified but is also involved in extracellular matrix synthesis and promotion of changes to cell shape. The gene product has been associated with tumor suppression but has also been correlated with metastasis based on changes to cell shape which can promote tumor cell invasion. Three transcript variants encoding different isoforms have been found for this gene. | SPARC |
| ENSG00000023445 | 330 | baculoviral IAP repeat containing 3 | This gene encodes a member of the IAP family of proteins that inhibit apoptosis by binding to tumor necrosis factor receptor-associated factors TRAF1 and TRAF2, probably by interfering with activation of ICE-like proteases. The encoded protein inhibits apoptosis induced by serum deprivation but does not affect apoptosis resulting from exposure to menadione, a potent inducer of free radicals. It contains 3 baculovirus IAP repeats and a ring finger domain. Transcript variants encoding the same isoform have been identified. | BIRC3 |
| ENSG00000142937 | 6202 | ribosomal protein S8 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S8E family of ribosomal proteins. It is located in the cytoplasm. Increased expression of this gene in colorectal tumors and colon polyps compared to matched normal colonic mucosa has been observed. This gene is co-transcribed with the small nucleolar RNA genes U38A, U38B, U39, and U40, which are located in its fourth, fifth, first, and second introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPS8 |
| ENSG00000211896 | ENSG00000211896 | immunoglobulin heavy constant gamma 1 (G1m marker) | NA | IGHG1 |
| ENSG00000092054 | 4625 | myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | MYH7 |
| ENSG00000131095 | 2670 | glial fibrillary acidic protein | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | GFAP |
| ENSG00000158710 | 8407 | transgelin 2 | The protein encoded by this gene is similar to the protein transgelin, which is one of the earliest markers of differentiated smooth muscle. The specific function of this protein has not yet been determined, although it is thought to be a tumor suppressor. Multiple transcript variants encoding different isoforms have been found for this gene. | TAGLN2 |
| ENSG00000122852 | 653509 | surfactant protein A1 | This gene encodes a lung surfactant protein that is a member of a subfamily of C-type lectins called collectins. The encoded protein binds specific carbohydrate moieties found on lipids and on the surface of microorganisms. This protein plays an essential role in surfactant homeostasis and in the defense against respiratory pathogens. Mutations in this gene are associated with idiopathic pulmonary fibrosis. Alternate splicing results in multiple transcript variants. | SFTPA1 |
| ENSG00000137392 | 1208 | colipase | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | CLPS |
| ENSG00000118503 | 7128 | TNF alpha induced protein 3 | This gene was identified as a gene whose expression is rapidly induced by the tumor necrosis factor (TNF). The protein encoded by this gene is a zinc finger protein and ubiquitin-editing enzyme, and has been shown to inhibit NF-kappa B activation as well as TNF-mediated apoptosis. The encoded protein, which has both ubiquitin ligase and deubiquitinase activities, is involved in the cytokine-mediated immune and inflammatory responses. Several transcript variants encoding the same protein have been found for this gene. | TNFAIP3 |
| ENSG00000103187 | 23406 | coactosin like F-actin binding protein 1 | This gene encodes one of the numerous actin-binding proteins which regulate the actin cytoskeleton. This protein binds F-actin, and also interacts with 5-lipoxygenase, which is the first committed enzyme in leukotriene biosynthesis. Although this gene has been reported to map to chromosome 17 in the Smith-Magenis syndrome region, the best alignments for this gene are to chromosome 16. The Smith-Magenis syndrome region is the site of two related pseudogenes. | COTL1 |
| ENSG00000143947 | 6233 | ribosomal protein S27a | Ubiquitin, a highly conserved protein that has a major role in targeting cellular proteins for degradation by the 26S proteasome, is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin fused to an unrelated protein. This gene encodes a fusion protein consisting of ubiquitin at the N terminus and ribosomal protein S27a at the C terminus. When expressed in yeast, the protein is post-translationally processed, generating free ubiquitin monomer and ribosomal protein S27a. Ribosomal protein S27a is a component of the 40S subunit of the ribosome and belongs to the S27AE family of ribosomal proteins. It contains C4-type zinc finger domains and is located in the cytoplasm. Pseudogenes derived from this gene are present in the genome. As with ribosomal protein S27a, ribosomal protein L40 is also synthesized as a fusion protein with ubiquitin; similarly, ribosomal protein S30 is synthesized as a fusion protein with the ubiquitin-like protein fubi. Multiple alternatively spliced transcript variants that encode the same proteins have been identified. | RPS27A |
| ENSG00000156804 | 114907 | F-box protein 32 | This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of the ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbxs class and contains an F-box domain. This protein is highly expressed during muscle atrophy, whereas mice deficient in this gene were found to be resistant to atrophy. This protein is thus a potential drug target for the treatment of muscle atrophy. Alternative splicing results in multiple transcript variants encoding different isoforms. | FBXO32 |
| ENSG00000223865 | 3115 | major histocompatibility complex, class II, DP beta 1 | HLA-DPB belongs to the HLA class II beta chain paralogues. This class II molecule is a heterodimer consisting of an alpha (DPA) and a beta chain (DPB), both anchored in the membrane. It plays a central role in the immune system by presenting peptides derived from extracellular proteins. Class II molecules are expressed in antigen presenting cells (APC: B lymphocytes, dendritic cells, macrophages). The beta chain is approximately 26-28 kDa and its gene contains 6 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the two extracellular domains, exon 4 encodes the transmembrane domain and exon 5 encodes the cytoplasmic tail. Within the DP molecule both the alpha chain and the beta chain contain the polymorphisms specifying the peptide binding specificities, resulting in up to 4 different molecules. | HLA-DPB1 |
| ENSG00000102879 | 11151 | coronin 1A | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. Alternative splicing results in multiple transcript variants. A related pseudogene has been defined on chromosome 16. | CORO1A |
| ENSG00000206172 | 3039 | hemoglobin subunit alpha 1 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | HBA1 |
| ENSG00000173641 | 27129 | heat shock protein family B (small) member 7 | NA | HSPB7 |
| ENSG00000178104 | 9659 | phosphodiesterase 4D interacting protein | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | PDE4DIP |
| ENSG00000163131 | 1520 | cathepsin S | The protein encoded by this gene, a member of the peptidase C1 family, is a lysosomal cysteine proteinase that may participate in the degradation of antigenic proteins to peptides for presentation on MHC class II molecules. The encoded protein can function as an elastase over a broad pH range in alveolar macrophages. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | CTSS |
| ENSG00000089157 | 6175 | ribosomal protein lateral stalk subunit P0 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein, which is the functional equivalent of the E. coli L10 ribosomal protein, belongs to the L10P family of ribosomal proteins. It is a neutral phosphoprotein with a C-terminal end that is nearly identical to the C-terminal ends of the acidic ribosomal phosphoproteins P1 and P2. The P0 protein can interact with P1 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Transcript variants derived from alternative splicing exist; they encode the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPLP0 |
| ENSG00000180354 | 222166 | maturin, neural progenitor differentiation regulator homolog (Xenopus) | NA | MTURN |
| ENSG00000168928 | 440387 | chymotrypsinogen B2 | NA | CTRB2 |
| ENSG00000090104 | 5996 | regulator of G-protein signaling 1 | This gene encodes a member of the regulator of G-protein signalling family. This protein is located on the cytosolic side of the plasma membrane and contains a conserved, 120 amino acid motif called the RGS domain. The protein attenuates the signalling activity of G-proteins by binding to activated, GTP-bound G alpha subunits and acting as a GTPase activating protein (GAP), increasing the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. | RGS1 |
| ENSG00000142676 | 6135 | ribosomal protein L11 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L5P family of ribosomal proteins. It is located in the cytoplasm. The protein probably associates with the 5S rRNA. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPL11 |
| ENSG00000168028 | 3921 | ribosomal protein SA | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Many of the effects of laminin are mediated through interactions with cell surface receptors. These receptors include members of the integrin family, as well as non-integrin laminin-binding proteins. This gene encodes a high-affinity, non-integrin family, laminin receptor 1. This receptor has been variously called 67 kD laminin receptor, 37 kD laminin receptor precursor (37LRP) and p40 ribosome-associated protein. The amino acid sequence of laminin receptor 1 is highly conserved through evolution, suggesting a key biological function. It has been observed that the level of the laminin receptor transcript is higher in colon carcinoma tissue and lung cancer cell line than their normal counterparts. Also, there is a correlation between the upregulation of this polypeptide in cancer cells and their invasive and metastatic phenotype. Multiple copies of this gene exist, however, most of them are pseudogenes thought to have arisen from retropositional events. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | RPSA |
| ENSG00000021300 | 58473 | pleckstrin homology domain containing B1 | NA | PLEKHB1 |
| ENSG00000211899 | ENSG00000211899 | immunoglobulin heavy constant mu | NA | IGHM |
| ENSG00000219073 | 23436 | chymotrypsin like elastase family member 3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | CELA3B |
| ENSG00000115415 | 6772 | signal transducer and activator of transcription 1 | The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein can be activated by various ligands including interferon-alpha, interferon-gamma, EGF, PDGF and IL6. This protein mediates the expression of a variety of genes, which is thought to be important for cell viability in response to different cell stimuli and pathogens. Two alternatively spliced transcript variants encoding distinct isoforms have been described. | STAT1 |
| ENSG00000109472 | 1363 | carboxypeptidase E | This gene encodes a member of the M14 family of metallocarboxypeptidases. The encoded preproprotein is proteolytically processed to generate the mature peptidase. This peripheral membrane protein cleaves C-terminal amino acid residues and is involved in the biosynthesis of peptide hormones and neurotransmitters, including insulin. This protein may also function independently of its peptidase activity, as a neurotrophic factor that promotes neuronal survival, and as a sorting receptor that binds to regulated secretory pathway proteins, including prohormones. Mutations in this gene are implicated in type 2 diabetes. | CPE |
| ENSG00000078804 | 58476 | tumor protein p53 inducible nuclear protein 2 | NA | TP53INP2 |
| ENSG00000095637 | 10580 | sorbin and SH3 domain containing 1 | This gene encodes a CBL-associated protein which functions in the signaling and stimulation of insulin. Mutations in this gene may be associated with human disorders of insulin resistance. Alternative splicing results in multiple transcript variants. | SORBS1 |
| ENSG00000168925 | 1504 | chymotrypsinogen B1 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | CTRB1 |
| ENSG00000160255 | 3689 | integrin subunit beta 2 | This gene encodes an integrin beta chain, which combines with multiple different alpha chains to form different integrin heterodimers. Integrins are integral cell-surface proteins that participate in cell adhesion as well as cell-surface mediated signalling. The encoded protein plays an important role in immune response and defects in this gene cause leukocyte adhesion deficiency. Alternative splicing results in multiple transcript variants. | ITGB2 |
| ENSG00000090339 | 3383 | intercellular adhesion molecule 1 | This gene encodes a cell surface glycoprotein which is typically expressed on endothelial cells and cells of the immune system. It binds to integrins of type CD11a / CD18, or CD11b / CD18 and is also exploited by Rhinovirus as a receptor. | ICAM1 |
| ENSG00000157601 | 4599 | MX dynamin like GTPase 1 | This gene encodes a guanosine triphosphate (GTP)-metabolizing protein that participates in the cellular antiviral response. The encoded protein is induced by type I and type II interferons and antagonizes the replication process of several different RNA and DNA viruses. There is a related gene located adjacent to this gene on chromosome 21, and there are multiple pseudogenes located in a cluster on chromosome 4. Alternative splicing results in multiple transcript variants. | MX1 |
| ENSG00000105372 | 6223 | ribosomal protein S19 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S19E family of ribosomal proteins. It is located in the cytoplasm. Mutations in this gene cause Diamond-Blackfan anemia (DBA), a constitutional erythroblastopenia characterized by absent or decreased erythroid precursors, in a subset of patients. This suggests a possible extra-ribosomal function for this gene in erythropoietic differentiation and proliferation, in addition to its ribosomal function. Higher expression levels of this gene in some primary colon carcinomas compared to matched normal colon tissues has been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPS19 |
| ENSG00000104879 | 1158 | creatine kinase, M-type | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | CKM |
| ENSG00000109846 | 1410 | crystallin alpha B | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | CRYAB |
| ENSG00000196205 | ENSG00000196205 | eukaryotic translation elongation factor 1 alpha 1 pseudogene 5 | NA | EEF1A1P5 |
| ENSG00000168484 | 6440 | surfactant protein C | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | SFTPC |
| ENSG00000140853 | 84166 | NLR family CARD domain containing 5 | This gene encodes a member of the caspase recruitment domain-containing NLR family. This gene plays a role in cytokine response and antiviral immunity through its inhibition of NF-kappa-B activation and negative regulation of type I interferon signaling pathways. | NLRC5 |
| ENSG00000149273 | 6188 | ribosomal protein S3 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit, where it forms part of the domain where translation is initiated. The protein belongs to the S3P family of ribosomal proteins. Studies of the mouse and rat proteins have demonstrated that the protein has an extraribosomal role as an endonuclease involved in the repair of UV-induced DNA damage. The protein appears to be located in both the cytoplasm and nucleus but not in the nucleolus. Higher levels of expression of this gene in colon adenocarcinomas and adenomatous polyps compared to adjacent normal colonic mucosa have been observed. This gene is co-transcribed with the small nucleolar RNA genes U15A and U15B, which are located in its first and fifth introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | RPS3 |
| ENSG00000111679 | 5777 | protein tyrosine phosphatase, non-receptor type 6 | The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. N-terminal part of this PTP contains two tandem Src homolog (SH2) domains, which act as protein phospho-tyrosine binding domains, and mediate the interaction of this PTP with its substrates. This PTP is expressed primarily in hematopoietic cells, and functions as an important regulator of multiple signaling pathways in hematopoietic cells. This PTP has been shown to interact with, and dephosphorylate a wide spectrum of phospho-proteins involved in hematopoietic cell signaling. Multiple alternatively spliced variants of this gene, which encode distinct isoforms, have been reported. | PTPN6 |
| ENSG00000103811 | 1512 | cathepsin H | The protein encoded by this gene is a lysosomal cysteine proteinase important in the overall degradation of lysosomal proteins. It is composed of a dimer of disulfide-linked heavy and light chains, both produced from a single protein precursor. The encoded protein, which belongs to the peptidase C1 protein family, can act both as an aminopeptidase and as an endopeptidase. Increased expression of this gene has been correlated with malignant progression of prostate tumors. Alternate splicing of this gene results in multiple transcript variants encoding different isoforms. | CTSH |
| ENSG00000130294 | 547 | kinesin family member 1A | The protein encoded by this gene is a member of the kinesin family and functions as an anterograde motor protein that transports membranous organelles along axonal microtubules. Mutations at this locus have been associated with spastic paraplegia-30 and hereditary sensory neuropathy IIC. Alternatively spliced transcript variants encoding distinct isoforms have been described. | KIF1A |
| ENSG00000198668 | 801 | calmodulin 1 (phosphorylase kinase, delta) | This gene encodes a member of the EF-hand calcium-binding protein family. It is one of three genes which encode an identical calcium binding protein which is one of the four subunits of phosphorylase kinase. Two pseudogenes have been identified on chromosome 7 and X. Multiple transcript variants encoding different isoforms have been found for this gene. | CALM1 |
| ENSG00000198668 | 805 | calmodulin 2 (phosphorylase kinase, delta) | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | CALM2 |
| ENSG00000162734 | 8682 | phosphoprotein enriched in astrocytes 15 | This gene encodes a death effector domain-containing protein that functions as a negative regulator of apoptosis. The encoded protein is an endogenous substrate for protein kinase C. This protein is also overexpressed in type 2 diabetes mellitus, where it may contribute to insulin resistance in glucose uptake. Alternative splicing results in multiple transcript variants. | PEA15 |
| ENSG00000130303 | 684 | bone marrow stromal cell antigen 2 | Bone marrow stromal cells are involved in the growth and development of B-cells. The specific function of the protein encoded by the bone marrow stromal cell antigen 2 is undetermined; however, this protein may play a role in pre-B-cell growth and in rheumatoid arthritis. | BST2 |
| ENSG00000198125 | 4151 | myoglobin | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | MB |
| ENSG00000152661 | 2697 | gap junction protein alpha 1 | This gene is a member of the connexin gene family. The encoded protein is a component of gap junctions, which are composed of arrays of intercellular channels that provide a route for the diffusion of low molecular weight materials from cell to cell. The encoded protein is the major protein of gap junctions in the heart that are thought to have a crucial role in the synchronized contraction of the heart and in embryonic development. A related intronless pseudogene has been mapped to chromosome 5. Mutations in this gene have been associated with oculodentodigital dysplasia, autosomal recessive craniometaphyseal dysplasia and heart malformations. | GJA1 |
| ENSG00000166963 | 4130 | microtubule associated protein 1A | This gene encodes a protein that belongs to the microtubule-associated protein family. The proteins of this family are thought to be involved in microtubule assembly, which is an essential step in neurogenesis. The product of this gene is a precursor polypeptide that presumably undergoes proteolytic processing to generate the final MAP1A heavy chain and LC2 light chain. Expression of this gene is almost exclusively in the brain. Studies of the rat microtubule-associated protein 1A gene suggested a role in early events of spinal cord development. | MAP1A |
| ENSG00000187514 | 5757 | prothymosin, alpha | NA | PTMA |
| ENSG00000187514 | 728026 | prothymosin alpha-like | NA | LOC728026 |
| ENSG00000172037 | 3913 | laminin subunit beta 2 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins, composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively), form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 2. The beta 2 chain contains the 7 structural domains typical of beta chains of laminin, including the short alpha region. However, unlike beta 1 chain, beta 2 has a more restricted tissue distribution. It is enriched in the basement membrane of muscles at the neuromuscular junctions, kidney glomerulus and vascular smooth muscle. Transgenic mice in which the beta 2 chain gene was inactivated by homologous recombination, showed defects in the maturation of neuromuscular junctions and impairment of glomerular filtration. Alternative splicing involving a non consensus 5’ splice site (gc) in the 5’ UTR of this gene has been reported. It was suggested that inefficient splicing of this first intron, which does not change the protein sequence, results in a greater abundance of the unspliced form of the transcript than the spliced form. The full-length nature of the spliced transcript is not known. | LAMB2 |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_",9,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
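The table above lists some Ensembl IDs more than once (ENSG00000187514 maps to both PTMA and LOC728026), and `queryMany` results can also include a `notfound` column for unmatched IDs, so writing `out$query` directly may put repeated or unannotated IDs in the file. The following is a minimal sketch of one way to clean the list before writing it. The helper `unique_found_queries` and the toy data frame are hypothetical and not part of the original analysis, and whether duplicates and unmatched IDs should be dropped at all is an assumption.

```r
# Toy stand-in for as.data.frame(out); the real `out` comes from mygene::queryMany().
out_df <- data.frame(
  query    = c("ENSG00000187514", "ENSG00000187514",
               "ENSG00000172037", "ENSG00000090920"),
  symbol   = c("PTMA", "LOC728026", "LAMB2", NA),
  notfound = c(NA, NA, NA, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop queries flagged as not found, then keep each ID once.
unique_found_queries <- function(df) {
  if (is.null(df$notfound)) {
    # No `notfound` column means every query returned at least one hit.
    found <- rep(TRUE, nrow(df))
  } else {
    # `notfound` is NA for matched rows and TRUE for unmatched ones.
    found <- is.na(df$notfound) | !df$notfound
  }
  unique(df$query[found])
}

ids <- unique_found_queries(out_df)
print(ids)
```

The resulting `ids` vector could then be passed to `write.table()` in place of `as.factor(out$query)`, with the same `col.names`, `row.names` and `quote` settings as above.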
out <- mygene::queryMany(gene_list[10,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | summary | name | X_id | notfound |
|---|---|---|---|---|---|
| TG | ENSG00000042832 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | thyroglobulin | 7038 | NA |
| TPO | ENSG00000115705 | This gene encodes a membrane-bound glycoprotein. The encoded protein acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones, thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter, and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene, but the full-length nature of some variants has not been determined. | thyroid peroxidase | 7173 | NA |
| DES | ENSG00000175084 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | desmin | 1674 | NA |
| PAX8 | ENSG00000125618 | This gene encodes a member of the paired box (PAX) family of transcription factors. Members of this gene family typically encode proteins that contain a paired box domain, an octapeptide, and a paired-type homeodomain. This nuclear protein is involved in thyroid follicular cell development and expression of thyroid-specific genes. Mutations in this gene have been associated with thyroid dysgenesis, thyroid follicular carcinomas and atypical follicular thyroid adenomas. Alternatively spliced transcript variants encoding different isoforms have been described. | paired box 8 | 7849 | NA |
| CLU | ENSG00000120885 | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | clusterin | 1191 | NA |
| NA | ENSG00000090920 | NA | NA | NA | TRUE |
| RAP1GAP | ENSG00000076864 | This gene encodes a type of GTPase-activating-protein (GAP) that down-regulates the activity of the ras-related RAP1 protein. RAP1 acts as a molecular switch by cycling between an inactive GDP-bound form and an active GTP-bound form. The product of this gene, RAP1GAP, promotes the hydrolysis of bound GTP and hence returns RAP1 to the inactive state whereas other proteins, guanine nucleotide exchange factors (GEFs), act as RAP1 activators by facilitating the conversion of RAP1 from the GDP- to the GTP-bound form. In general, ras subfamily proteins, such as RAP1, play key roles in receptor-linked signaling pathways that control cell growth and differentiation. RAP1 plays a role in diverse processes such as cell proliferation, adhesion, differentiation, and embryogenesis. Alternative splicing results in multiple transcript variants encoding distinct proteins. | RAP1 GTPase activating protein | 5909 | NA |
| MYH11 | ENSG00000133392 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | 4629 | NA |
| FN1 | ENSG00000115414 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | fibronectin 1 | 2335 | NA |
| AHNAK | ENSG00000124942 | NA | AHNAK nucleoprotein | 79026 | NA |
| NEAT1 | ENSG00000245532 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | nuclear paraspeckle assembly transcript 1 (non-protein coding) | 283131 | NA |
| TPT1 | ENSG00000133112 | NA | tumor protein, translationally-controlled 1 | 7178 | NA |
| GAPDH | ENSG00000111640 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | glyceraldehyde-3-phosphate dehydrogenase | 2597 | NA |
| HSP90B1 | ENSG00000166598 | This gene encodes a member of a family of adenosine triphosphate(ATP)-metabolizing molecular chaperones with roles in stabilizing and folding other proteins. The encoded protein is localized to melanosomes and the endoplasmic reticulum. Expression of this protein is associated with a variety of pathogenic states, including tumor formation. There is a microRNA gene located within the 5’ exon of this gene. There are pseudogenes for this gene on chromosomes 1 and 15. | heat shock protein 90kDa beta family member 1 | 7184 | NA |
| ALDOA | ENSG00000149925 | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | aldolase, fructose-bisphosphate A | 226 | NA |
| LIPG | ENSG00000101670 | The protein encoded by this gene has substantial phospholipase activity and may be involved in lipoprotein metabolism and vascular biology. This protein is designated a member of the TG lipase family by its sequence and characteristic lid region which provides substrate specificity for enzymes of the TG lipase family. | lipase G, endothelial type | 9388 | NA |
| GNAS | ENSG00000087460 | This locus has a highly complex imprinted expression pattern. It gives rise to maternally, paternally, and biallelically expressed transcripts that are derived from four alternative promoters and 5’ exons. Some transcripts contain a differentially methylated region (DMR) at their 5’ exons, and this DMR is commonly found in imprinted genes and correlates with transcript expression. An antisense transcript is produced from an overlapping locus on the opposite strand. One of the transcripts produced from this locus, and the antisense transcript, are paternally expressed noncoding RNAs, and may regulate imprinting in this region. In addition, one of the transcripts contains a second overlapping ORF, which encodes a structurally unrelated protein - Alex. Alternative splicing of downstream exons is also observed, which results in different forms of the stimulatory G-protein alpha subunit, a key element of the classical signal transduction pathway linking receptor-ligand interactions with the activation of adenylyl cyclase and a variety of cellular responses. Multiple transcript variants encoding different isoforms have been found for this gene. Mutations in this gene result in pseudohypoparathyroidism type 1a, pseudohypoparathyroidism type 1b, Albright hereditary osteodystrophy, pseudopseudohypoparathyroidism, McCune-Albright syndrome, progressive osseous heteroplasia, polyostotic fibrous dysplasia of bone, and some pituitary tumors. | GNAS complex locus | 2778 | NA |
| CTSB | ENSG00000164733 | This gene encodes a member of the C1 family of peptidases. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to form the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. It is also known as amyloid precursor protein secretase and is involved in the proteolytic processing of amyloid precursor protein (APP). Incomplete proteolytic processing of APP has been suggested to be a causative factor in Alzheimer’s disease, the most common cause of dementia. Overexpression of the encoded protein has been associated with esophageal adenocarcinoma and other tumors. Multiple pseudogenes of this gene have been identified. | cathepsin B | 1508 | NA |
| SORD | ENSG00000140263 | Sorbitol dehydrogenase (SORD; EC 1.1.1.14) catalyzes the interconversion of polyols and their corresponding ketoses, and together with aldose reductase (ALDR1; MIM 103880), makes up the sorbitol pathway that is believed to play an important role in the development of diabetic complications (summarized by Carr and Markham, 1995 [PubMed 8535074]). The first reaction of the pathway (also called the polyol pathway) is the reduction of glucose to sorbitol by ALDR1 with NADPH as the cofactor. SORD then oxidizes the sorbitol to fructose using NAD(+) cofactor. | sorbitol dehydrogenase | 6652 | NA |
| ANXA1 | ENSG00000135046 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | annexin A1 | 301 | NA |
| APP | ENSG00000142192 | This gene encodes a cell surface receptor and transmembrane precursor protein that is cleaved by secretases to form a number of peptides. Some of these peptides are secreted and can bind to the acetyltransferase complex APBB1/TIP60 to promote transcriptional activation, while others form the protein basis of the amyloid plaques found in the brains of patients with Alzheimer disease. In addition, two of the peptides are antimicrobial peptides, having been shown to have bacteriocidal and antifungal activities. Mutations in this gene have been implicated in autosomal dominant Alzheimer disease and cerebroarterial amyloidosis (cerebral amyloid angiopathy). Multiple transcript variants encoding several different isoforms have been found for this gene. | amyloid beta precursor protein | 351 | NA |
| CALR | ENSG00000179218 | Calreticulin is a multifunctional protein that acts as a major Ca(2+)-binding (storage) protein in the lumen of the endoplasmic reticulum. It is also found in the nucleus, suggesting that it may have a role in transcription regulation. Calreticulin binds to the synthetic peptide KLGFFKR, which is almost identical to an amino acid sequence in the DNA-binding domain of the superfamily of nuclear receptors. Calreticulin binds to antibodies in certain sera of systemic lupus and Sjogren patients which contain anti-Ro/SSA antibodies, it is highly conserved among species, and it is located in the endoplasmic and sarcoplasmic reticulum where it may bind calcium. The amino terminus of calreticulin interacts with the DNA-binding domain of the glucocorticoid receptor and prevents the receptor from binding to its specific glucocorticoid response element. Calreticulin can inhibit the binding of androgen receptor to its hormone-responsive DNA element and can inhibit androgen receptor and retinoic acid receptor transcriptional activities in vivo, as well as retinoic acid-induced neuronal differentiation. Thus, calreticulin can act as an important modulator of the regulation of gene transcription by nuclear hormone receptors. Systemic lupus erythematosus is associated with increased autoantibody titers against calreticulin but calreticulin is not a Ro/SS-A antigen. Earlier papers referred to calreticulin as an Ro/SS-A antigen but this was later disproven. Increased autoantibody titer against human calreticulin is found in infants with complete congenital heart block of both the IgG and IgM classes. | calreticulin | 811 | NA |
| EPCAM | ENSG00000119888 | This gene encodes a carcinoma-associated antigen and is a member of a family that includes at least two type I membrane proteins. This antigen is expressed on most normal epithelial cells and gastrointestinal carcinomas and functions as a homotypic calcium-independent cell adhesion molecule. The antigen is being used as a target for immunotherapy treatment of human carcinomas. Mutations in this gene result in congenital tufting enteropathy. | epithelial cell adhesion molecule | 4072 | NA |
| PLEKHH1 | ENSG00000054690 | NA | pleckstrin homology, MyTH4 and FERM domain containing H1 | 57475 | NA |
| TFF3 | ENSG00000160180 | Members of the trefoil family are characterized by having at least one copy of the trefoil motif, a 40-amino acid domain that contains three conserved disulfides. They are stable secretory proteins expressed in gastrointestinal mucosa. Their functions are not defined, but they may protect the mucosa from insults, stabilize the mucus layer and affect healing of the epithelium. This gene is expressed in goblet cells of the intestines and colon. This gene and two other related trefoil family member genes are found in a cluster on chromosome 21. | trefoil factor 3 | 7033 | NA |
| TPM3 | ENSG00000143549 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. | tropomyosin 3 | 7170 | NA |
| RGL3 | ENSG00000205517 | NA | ral guanine nucleotide dissociation stimulator like 3 | 57139 | NA |
| TAGLN | ENSG00000149591 | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | transgelin | 6876 | NA |
| NPNT | ENSG00000168743 | NA | nephronectin | 255743 | NA |
| ADGRG1 | ENSG00000205336 | This gene encodes a member of the G protein-coupled receptor family and regulates brain cortical patterning. The encoded protein binds specifically to transglutaminase 2, a component of tissue and tumor stroma implicated as an inhibitor of tumor progression. Mutations in this gene are associated with a brain malformation known as bilateral frontoparietal polymicrogyria. Alternative splicing results in multiple transcript variants. | adhesion G protein-coupled receptor G1 | 9289 | NA |
| GOLGA8B | ENSG00000215252 | NA | golgin A8 family member B | 440270 | NA |
| GOLGA8A | ENSG00000215252 | The Golgi apparatus, which participates in glycosylation and transport of proteins and lipids in the secretory pathway, consists of a series of stacked, flattened membrane sacs referred to as cisternae. Interactions between the Golgi and microtubules are thought to be important for the reorganization of the Golgi after it fragments during mitosis. The golgins constitute a family of proteins which are localized to the Golgi. This gene encodes a golgin which structurally resembles its family member GOLGA2, suggesting that they may share a similar function. There are many similar copies of this gene on chromosome 15. Alternative splicing results in multiple transcript variants. | golgin A8 family member A | 23015 | NA |
| IVD | ENSG00000128928 | Isovaleryl-CoA dehydrogenase (IVD) is a mitochondrial matrix enzyme that catalyzes the third step in leucine catabolism. The genetic deficiency of IVD results in an accumulation of isovaleric acid, which is toxic to the central nervous system and leads to isovaleric acidemia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | isovaleryl-CoA dehydrogenase | 3712 | NA |
| MTCH1 | ENSG00000137409 | This gene encodes a member of the mitochondrial carrier family. The encoded protein is localized to the mitochondrion inner membrane and induces apoptosis independent of the proapoptotic proteins Bax and Bak. Pseudogenes on chromosomes 6 and 11 have been identified for this gene. Alternatively spliced transcript variants encoding multiple isoforms have been observed. | mitochondrial carrier 1 | 23787 | NA |
| APOE | ENSG00000130203 | The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | apolipoprotein E | 348 | NA |
| PEBP1 | ENSG00000089220 | This gene encodes a member of the phosphatidylethanolamine-binding family of proteins and has been shown to modulate multiple signaling pathways, including the MAP kinase (MAPK), NF-kappa B, and glycogen synthase kinase-3 (GSK-3) signaling pathways. The encoded protein can be further processed to form a smaller cleavage product, hippocampal cholinergic neurostimulating peptide (HCNP), which may be involved in neural development. This gene has been implicated in numerous human cancers and may act as a metastasis suppressor gene. Multiple pseudogenes of this gene have been identified in the genome. | phosphatidylethanolamine binding protein 1 | 5037 | NA |
| GFAP | ENSG00000131095 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | glial fibrillary acidic protein | 2670 | NA |
| SYNPO2 | ENSG00000172403 | NA | synaptopodin 2 | 171024 | NA |
| INPP5J | ENSG00000185133 | NA | inositol polyphosphate-5-phosphatase J | 27124 | NA |
| AGRN | ENSG00000188157 | This gene encodes one of several proteins that are critical in the development of the neuromuscular junction (NMJ), as identified in mouse knock-out studies. The encoded protein contains several laminin G, Kazal type serine protease inhibitor, and epidermal growth factor domains. Additional post-translational modifications occur to add glycosaminoglycans and disulfide bonds. In one family with congenital myasthenic syndrome affecting limb-girdle muscles, a mutation in this gene was found. Alternative splicing results in multiple transcript variants encoding different isoforms. | agrin | 375790 | NA |
| MKNK2 | ENSG00000099875 | This gene encodes a member of the calcium/calmodulin-dependent protein kinases (CAMK) Ser/Thr protein kinase family, which belongs to the protein kinase superfamily. This protein contains conserved DLG (asp-leu-gly) and ENIL (glu-asn-ile-leu) motifs, and an N-terminal polybasic region which binds importin A and the translation factor scaffold protein eukaryotic initiation factor 4G (eIF4G). This protein is one of the downstream kinases activated by mitogen-activated protein (MAP) kinases. It phosphorylates the eukaryotic initiation factor 4E (eIF4E), thus playing important roles in the initiation of mRNA translation, oncogenic transformation and malignant cell proliferation. In addition to eIF4E, this protein also interacts with von Hippel-Lindau tumor suppressor (VHL), ring-box 1 (Rbx1) and Cullin2 (Cul2), which are all components of the CBC(VHL) ubiquitin ligase E3 complex. Multiple alternatively spliced transcript variants have been found, but the full-length nature and biological activity of only two variants are determined. These two variants encode distinct isoforms which differ in activity and regulation, and in subcellular localization. | MAP kinase interacting serine/threonine kinase 2 | 2872 | NA |
| ATP8A1 | ENSG00000124406 | The P-type adenosinetriphosphatases (P-type ATPases) are a family of proteins which use the free energy of ATP hydrolysis to drive uphill transport of ions across membranes. Several subfamilies of P-type ATPases have been identified. One subfamily catalyzes transport of heavy metal ions. Another subfamily transports non-heavy metal ions (NMHI). The protein encoded by this gene is a member of the third subfamily of P-type ATPases and acts to transport amphipaths, such as phosphatidylserine. Two transcript variants encoding different isoforms have been found for this gene. | ATPase phospholipid transporting 8A1 | 10396 | NA |
| S100A9 | ENSG00000163220 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | S100 calcium binding protein A9 | 6280 | NA |
| MT1F | ENSG00000198417 | NA | metallothionein 1F | 4494 | NA |
| KRT8 | ENSG00000170421 | This gene is a member of the type II keratin family clustered on the long arm of chromosome 12. Type I and type II keratins heteropolymerize to form intermediate-sized filaments in the cytoplasm of epithelial cells. The product of this gene typically dimerizes with keratin 18 to form an intermediate filament in simple single-layered epithelial cells. This protein plays a role in maintaining cellular structural integrity and also functions in signal transduction and cellular differentiation. Mutations in this gene cause cryptogenic cirrhosis. Alternatively spliced transcript variants have been found for this gene. | keratin 8 | 3856 | NA |
| FBXL16 | ENSG00000127585 | Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | F-box and leucine rich repeat protein 16 | 146330 | NA |
| AFAP1L2 | ENSG00000169129 | NA | actin filament associated protein 1 like 2 | 84632 | NA |
| RASSF4 | ENSG00000107551 | The function of this gene has not yet been determined but may involve a role in tumor suppression. Alternative splicing of this gene results in several transcript variants; however, most of the variants have not been fully described. | Ras association domain family member 4 | 83937 | NA |
| SELENBP1 | ENSG00000143416 | This gene encodes a member of the selenium-binding protein family. Selenium is an essential nutrient that exhibits potent anticarcinogenic properties, and deficiency of selenium may cause certain neurologic diseases. The effects of selenium in preventing cancer and neurologic diseases may be mediated by selenium-binding proteins, and decreased expression of this gene may be associated with several types of cancer. The encoded protein may play a selenium-dependent role in ubiquitination/deubiquitination-mediated protein degradation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | selenium binding protein 1 | 8991 | NA |
| TPM1 | ENSG00000140416 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | tropomyosin 1 (alpha) | 7168 | NA |
| MT1G | ENSG00000125144 | NA | metallothionein 1G | 4495 | NA |
| LOC100129518 | ENSG00000112096 | NA | uncharacterized LOC100129518 | 100129518 | NA |
| SOD2 | ENSG00000112096 | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | superoxide dismutase 2, mitochondrial | 6648 | NA |
| HSPB6 | ENSG00000004776 | This locus encodes a heat shock protein. The encoded protein likely plays a role in smooth muscle relaxation. | heat shock protein family B (small) member 6 | 126393 | NA |
| KIF5A | ENSG00000155980 | This gene encodes a member of the kinesin family of proteins. Members of this family are part of a multisubunit complex that functions as a microtubule motor in intracellular organelle transport. Mutations in this gene cause autosomal dominant spastic paraplegia 10. | kinesin family member 5A | 3798 | NA |
| H19 | ENSG00000130600 | This gene is located in an imprinted region of chromosome 11 near the insulin-like growth factor 2 (IGF2) gene. This gene is only expressed from the maternally-inherited chromosome, whereas IGF2 is only expressed from the paternally-inherited chromosome. The product of this gene is a long non-coding RNA which functions as a tumor suppressor. Mutations in this gene have been associated with Beckwith-Wiedemann Syndrome and Wilms tumorigenesis. Alternative splicing results in multiple transcript variants. | H19, imprinted maternally expressed transcript (non-protein coding) | 283120 | NA |
| EPOR | ENSG00000187266 | This gene encodes the erythropoietin receptor which is a member of the cytokine receptor family. Upon erythropoietin binding, this receptor activates Jak2 tyrosine kinase which activates different intracellular pathways including: Ras/MAP kinase, phosphatidylinositol 3-kinase and STAT transcription factors. The stimulated erythropoietin receptor appears to have a role in erythroid cell survival. Defects in the erythropoietin receptor may produce erythroleukemia and familial erythrocytosis. Dysregulation of this gene may affect the growth of certain tumors. Alternate splicing results in multiple transcript variants. | erythropoietin receptor | 2057 | NA |
| SLC4A11 | ENSG00000088836 | This gene encodes a voltage-regulated, electrogenic sodium-coupled borate cotransporter that is essential for borate homeostasis, cell growth and cell proliferation. Mutations in this gene have been associated with a number of endothelial corneal dystrophies including recessive corneal endothelial dystrophy 2, corneal dystrophy and perceptive deafness, and Fuchs endothelial corneal dystrophy. Multiple transcript variants encoding different isoforms have been described. | solute carrier family 4 member 11 | 83959 | NA |
| RHPN1 | ENSG00000158106 | NA | rhophilin, Rho GTPase binding protein 1 | 114822 | NA |
| MARCKSL1 | ENSG00000175130 | This gene encodes a member of the myristoylated alanine-rich C-kinase substrate (MARCKS) family. Members of this family play a role in cytoskeletal regulation, protein kinase C signaling and calmodulin signaling. The encoded protein affects the formation of adherens junction. Alternative splicing results in multiple transcript variants. Pseudogenes of this gene are located on the long arm of chromosomes 6 and 10. | MARCKS like 1 | 65108 | NA |
| COL23A1 | ENSG00000050767 | COL23A1 is a member of the transmembrane collagens, a subfamily of the nonfibrillar collagens that contain a single pass hydrophobic transmembrane domain (Banyard et al., 2003 [PubMed 12644459]). | collagen type XXIII alpha 1 chain | 91522 | NA |
| PLVAP | ENSG00000130300 | NA | plasmalemma vesicle associated protein | 83483 | NA |
| ITM2C | ENSG00000135916 | NA | integral membrane protein 2C | 81618 | NA |
| FAM129A | ENSG00000135842 | NA | family with sequence similarity 129 member A | 116496 | NA |
| COL1A1 | ENSG00000108821 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 1 | 1277 | NA |
| FOSL2 | ENSG00000075426 | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. | FOS like 2, AP-1 transcription factor subunit | 2355 | NA |
| PARM1 | ENSG00000169116 | NA | prostate androgen-regulated mucin-like protein 1 | 25849 | NA |
| CCL21 | ENSG00000137077 | This antimicrobial gene is one of several CC cytokine genes clustered on the p-arm of chromosome 9. Cytokines are a family of secreted proteins involved in immunoregulatory and inflammatory processes. The CC cytokines are proteins characterized by two adjacent cysteines. Similar to other chemokines the protein encoded by this gene inhibits hemopoiesis and stimulates chemotaxis. This protein is chemotactic in vitro for thymocytes and activated T cells, but not for B cells, macrophages, or neutrophils. The cytokine encoded by this gene may also play a role in mediating homing of lymphocytes to secondary lymphoid organs. It is a high affinity functional ligand for chemokine receptor 7 that is expressed on T and B lymphocytes and a known receptor for another member of the cytokine family (small inducible cytokine A19). | C-C motif chemokine ligand 21 | 6366 | NA |
| MYL9 | ENSG00000101335 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | myosin light chain 9 | 10398 | NA |
| ITPR3 | ENSG00000096433 | This gene encodes a receptor for inositol 1,4,5-trisphosphate, a second messenger that mediates the release of intracellular calcium. The receptor contains a calcium channel at the C-terminus and the ligand-binding site at the N-terminus. Knockout studies in mice suggest that type 2 and type 3 inositol 1,4,5-trisphosphate receptors play a key role in exocrine secretion underlying energy metabolism and growth. | inositol 1,4,5-trisphosphate receptor type 3 | 3710 | NA |
| SDC2 | ENSG00000169439 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-2 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-2 expression has been detected in several different tumor types. | syndecan 2 | 6383 | NA |
| STMN3 | ENSG00000197457 | This gene encodes a protein which is a member of the stathmin protein family. Members of this protein family form a complex with tubulins at a ratio of 2 tubulins for each stathmin protein. Microtubules require the ordered assembly of alpha- and beta-tubulins, and formation of a complex with stathmin disrupts microtubule formation and function. A pseudogene of this gene is located on chromosome 22. Alternative splicing results in multiple transcript variants. | stathmin 3 | 50861 | NA |
| PDK4 | ENSG00000004799 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | pyruvate dehydrogenase kinase 4 | 5166 | NA |
| B2M | ENSG00000166710 | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | beta-2-microglobulin | 567 | NA |
| MCAM | ENSG00000076706 | NA | melanoma cell adhesion molecule | 4162 | NA |
| COL3A1 | ENSG00000168542 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type III alpha 1 chain | 1281 | NA |
| FAM107A | ENSG00000168309 | NA | family with sequence similarity 107 member A | 11170 | NA |
| TLN1 | ENSG00000137076 | This gene encodes a cytoskeletal protein that is concentrated in areas of cell-substratum and cell-cell contacts. The encoded protein plays a significant role in the assembly of actin filaments and in spreading and migration of various cell types, including fibroblasts and osteoclasts. It codistributes with integrins in the cell surface membrane in order to assist in the attachment of adherent cells to extracellular matrices and of lymphocytes to other cells. The N-terminus of this protein contains elements for localization to cell-extracellular matrix junctions. The C-terminus contains binding sites for proteins such as beta-1-integrin, actin, and vinculin. | talin 1 | 7094 | NA |
| VEGFA | ENSG00000112715 | This gene is a member of the PDGF/VEGF growth factor family. It encodes a heparin-binding protein, which exists as a disulfide-linked homodimer. This growth factor induces proliferation and migration of vascular endothelial cells, and is essential for both physiological and pathological angiogenesis. Disruption of this gene in mice resulted in abnormal embryonic blood vessel formation. This gene is upregulated in many known tumors and its expression is correlated with tumor stage and progression. Elevated levels of this protein are found in patients with POEMS syndrome, also known as Crow-Fukase syndrome. Allelic variants of this gene have been associated with microvascular complications of diabetes 1 (MVCD1) and atherosclerosis. Alternatively spliced transcript variants encoding different isoforms have been described. There is also evidence for alternative translation initiation from upstream non-AUG (CUG) codons resulting in additional isoforms. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is antiangiogenic. Expression of some isoforms derived from the AUG start codon is regulated by a small upstream open reading frame, which is located within an internal ribosome entry site. | vascular endothelial growth factor A | 7422 | NA |
| ID4 | ENSG00000172201 | This gene encodes a member of the inhibitor of DNA binding (ID) protein family. These proteins are basic helix-loop-helix transcription factors which can act as tumor suppressors but lack DNA binding activity. Consequently, the activity of the encoded protein depends on the protein binding partner. | inhibitor of DNA binding 4, HLH protein | 3400 | NA |
| PCP4 | ENSG00000183036 | NA | Purkinje cell protein 4 | 5121 | NA |
| ARAP2 | ENSG00000047365 | The protein encoded by this gene contains ARF-GAP, RHO-GAP, ankyrin repeat, RAS-associating, and pleckstrin homology domains. The protein is a phosphatidylinositol (3,4,5)-trisphosphate-dependent Arf6 GAP that binds RhoA-GTP, but it lacks the predicted catalytic arginine in the RHO-GAP domain and does not have RHO-GAP activity. The protein associates with focal adhesions and functions downstream of RhoA to regulate focal adhesion dynamics. | ArfGAP with RhoGAP domain, ankyrin repeat and PH domain 2 | 116984 | NA |
| ECM1 | ENSG00000143369 | This gene encodes a soluble protein that is involved in endochondral bone formation, angiogenesis, and tumor biology. It also interacts with a variety of extracellular and structural proteins, contributing to the maintenance of skin integrity and homeostasis. Mutations in this gene are associated with lipoid proteinosis disorder (also known as hyalinosis cutis et mucosae or Urbach-Wiethe disease) that is characterized by generalized thickening of skin, mucosae and certain viscera. Alternatively spliced transcript variants encoding distinct isoforms have been described for this gene. | extracellular matrix protein 1 | 1893 | NA |
| COL1A2 | ENSG00000164692 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 2 chain | 1278 | NA |
| RP11-290D2.6 | ENSG00000273149 | NA | NA | ENSG00000273149 | NA |
| SNX1 | ENSG00000028528 | This gene encodes a member of the sorting nexin family. Members of this family contain a phox (PX) domain, which is a phosphoinositide binding domain, and are involved in intracellular trafficking. This endosomal protein regulates the cell-surface expression of epidermal growth factor receptor. This protein also has a role in sorting protease-activated receptor-1 from early endosomes to lysosomes. This protein may form oligomeric complexes with family members. This gene results in three transcript variants encoding distinct isoforms. | sorting nexin 1 | 6642 | NA |
| CMTM4 | ENSG00000183723 | This gene belongs to the chemokine-like factor gene superfamily, a novel family that is similar to the chemokine and the transmembrane 4 superfamilies of signaling molecules. This gene is one of several chemokine-like factor genes located in a cluster on chromosome 16. Alternatively spliced transcript variants encoding different isoforms have been identified. | CKLF like MARVEL transmembrane domain containing 4 | 146223 | NA |
| DCN | ENSG00000011465 | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | decorin | 1634 | NA |
| COL4A2 | ENSG00000134871 | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. The C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. | collagen type IV alpha 2 | 1284 | NA |
| ACTA2 | ENSG00000107796 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in smooth muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | actin, alpha 2, smooth muscle, aorta | 59 | NA |
| MYH7 | ENSG00000092054 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | myosin, heavy chain 7, cardiac muscle, beta | 4625 | NA |
| MBP | ENSG00000197971 | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | myelin basic protein | 4155 | NA |
| ACTG2 | ENSG00000163017 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | actin, gamma 2, smooth muscle, enteric | 72 | NA |
| SMTN | ENSG00000183963 | This gene encodes a structural protein that is found exclusively in contractile smooth muscle cells. It associates with stress fibers and constitutes part of the cytoskeleton. This gene is localized to chromosome 22q12.3, distal to the TUPLE1 locus and outside the DiGeorge syndrome deletion. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | smoothelin | 6525 | NA |
| KIAA1522 | ENSG00000162522 | NA | KIAA1522 | 57648 | NA |
| WFDC2 | ENSG00000101443 | This gene encodes a protein that is a member of the WFDC domain family. The WFDC domain, or WAP Signature motif, contains eight cysteines forming four disulfide bonds at the core of the protein, and functions as a protease inhibitor in many family members. This gene is expressed in pulmonary epithelial cells, and was also found to be expressed in some ovarian cancers. The encoded protein is a small secretory protein, which may be involved in sperm maturation. | WAP four-disulfide core domain 2 | 10406 | NA |
| SORBS1 | ENSG00000095637 | This gene encodes a CBL-associated protein which functions in the signaling and stimulation of insulin. Mutations in this gene may be associated with human disorders of insulin resistance. Alternative splicing results in multiple transcript variants. | sorbin and SH3 domain containing 1 | 10580 | NA |
| PPL | ENSG00000118898 | The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | periplakin | 5493 | NA |
| NACA | ENSG00000196531 | This gene encodes a protein that associates with basic transcription factor 3 (BTF3) to form the nascent polypeptide-associated complex (NAC). This complex binds to nascent proteins that lack a signal peptide motif as they emerge from the ribosome, blocking interaction with the signal recognition particle (SRP) and preventing mistranslocation to the endoplasmic reticulum. This protein is an IgE autoantigen in atopic dermatitis patients. Alternative splicing results in multiple transcript variants, but the full length nature of some of these variants, including those encoding very large proteins, has not been determined. There are multiple pseudogenes of this gene on different chromosomes. | nascent polypeptide-associated complex alpha subunit | 4666 | NA |
| FARP1 | ENSG00000152767 | This gene encodes a protein containing a FERM (4.1, ezrin, radixin, moesin) domain, a Dbl homology domain, and two pleckstrin homology domains. These domains are found in guanine nucleotide exchange factors and proteins that link the cytoskeleton to the cell membrane. The encoded protein functions in neurons to promote dendritic growth. Alternative splicing results in multiple transcript variants. | FERM, ARH/RhoGEF and pleckstrin domain protein 1 | 10160 | NA |
| RRBP1 | ENSG00000125844 | This gene encodes a ribosome-binding protein of the endoplasmic reticulum (ER) membrane. Studies suggest that this gene plays a role in ER proliferation, secretory pathways and secretory cell differentiation, and mediation of ER-microtubule interactions. Alternative splicing has been observed and protein isoforms are characterized by regions of N-terminal decapeptide and C-terminal heptad repeats. Splicing of the tandem repeats results in variations in ribosome-binding affinity and secretory function. The full-length nature of variants which differ in repeat length has not been determined. Pseudogenes of this gene have been identified on chromosomes 3 and 7, and RRBP1 has been excluded as a candidate gene in the cause of Alagille syndrome, the result of a mutation in a nearby gene on chromosome 20p12. | ribosome binding protein 1 | 6238 | NA |
| GOLGA8A | ENSG00000175265 | The Golgi apparatus, which participates in glycosylation and transport of proteins and lipids in the secretory pathway, consists of a series of stacked, flattened membrane sacs referred to as cisternae. Interactions between the Golgi and microtubules are thought to be important for the reorganization of the Golgi after it fragments during mitosis. The golgins constitute a family of proteins which are localized to the Golgi. This gene encodes a golgin which structurally resembles its family member GOLGA2, suggesting that they may share a similar function. There are many similar copies of this gene on chromosome 15. Alternative splicing results in multiple transcript variants. | golgin A8 family member A | 23015 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 10, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
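The query-then-write step is repeated once per cluster, and the table above shows that a single Ensembl ID can come back as more than one record (ENSG00000112096 appears for both LOC100129518 and SOD2), so the saved ID file can contain repeated lines. The following is a minimal sketch of how the step might be wrapped in a loop that drops repeated IDs. The helper names `write_cluster_genes` and `annotate_clusters`, the `query_fun` argument, and the choice to deduplicate are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical helper: save the queried Ensembl IDs for one cluster.
# `annot` is any data frame with a `query` column, such as
# as.data.frame(mygene::queryMany(...)) in the analysis above.
write_cluster_genes <- function(annot, cluster, out_dir,
                                prefix = "gene_names_clus_") {
  path <- file.path(out_dir, paste0(prefix, cluster, ".txt"))
  # Assumption: one line per distinct ID is wanted, so repeated queries
  # (one ID matching several records) are written only once.
  ids <- unique(as.character(annot$query))
  writeLines(ids, path)
  invisible(path)
}

# Hypothetical driver: annotate every row (cluster) of `gene_list`.
# `query_fun` takes a character vector of IDs and returns something that
# as.data.frame() can convert; it is passed in so the loop can be tried
# offline with a stand-in function.
annotate_clusters <- function(gene_list, out_dir, query_fun) {
  vapply(seq_len(nrow(gene_list)), function(k) {
    annot <- as.data.frame(query_fun(gene_list[k, ]))
    write_cluster_genes(annot, k, out_dir)
  }, character(1))
}

# With the real service, the query function would mirror the call used above:
# query_fun <- function(ids) mygene::queryMany(ids, scopes = "ensembl.gene",
#   fields = c("name", "summary", "symbol"), species = "human")

# Offline stand-in that echoes the IDs back, for trying out the loop.
fake_query <- function(ids) data.frame(query = ids, stringsAsFactors = FALSE)
demo_list <- rbind(c("ENSG00000112096", "ENSG00000112096", "ENSG00000170421"),
                   c("ENSG00000154146", "ENSG00000155980", "ENSG00000186395"))
demo_dir <- tempfile("clusters_")
dir.create(demo_dir)
demo_paths <- annotate_clusters(demo_list, demo_dir, fake_query)
```

Passing the query function as an argument keeps the file-writing logic independent of network access. If the duplicated lines are actually wanted downstream, the `unique()` call can simply be removed.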
out <- mygene::queryMany(gene_list[11,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | X_id | summary | symbol | query | notfound |
|---|---|---|---|---|---|
| neurogranin | 4900 | Neurogranin (NRGN) is the human homolog of the neuron-specific rat RC3/neurogranin gene. This gene encodes a postsynaptic protein kinase substrate that binds calmodulin in the absence of calcium. The NRGN gene contains four exons and three introns. Exons 1 and 2 encode the protein and exons 3 and 4 contain untranslated sequences. It is suggested that NRGN is a direct target for thyroid hormone in human brain, and that control of expression of this gene could underlie many of the consequences of hypothyroidism on mental states during development as well as in adult subjects. | NRGN | ENSG00000154146 | NA |
| kinesin family member 5A | 3798 | This gene encodes a member of the kinesin family of proteins. Members of this family are part of a multisubunit complex that functions as a microtubule motor in intracellular organelle transport. Mutations in this gene cause autosomal dominant spastic paraplegia 10. | KIF5A | ENSG00000155980 | NA |
| keratin 10 | 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | KRT10 | ENSG00000186395 | NA |
| vimentin | 7431 | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | VIM | ENSG00000026025 | NA |
| keratin 1 | 3848 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT1 | ENSG00000167768 | NA |
| glyceraldehyde-3-phosphate dehydrogenase | 2597 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | GAPDH | ENSG00000111640 | NA |
| actin binding LIM protein 1 | 3983 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | ABLIM1 | ENSG00000099204 | NA |
| polycystin 1, transient receptor potential channel interacting | 5310 | This gene encodes a member of the polycystin protein family. The encoded glycoprotein contains a large N-terminal extracellular region, multiple transmembrane domains and a cytoplasmic C-tail. It is an integral membrane protein that functions as a regulator of calcium permeable cation channels and intracellular calcium homoeostasis. It is also involved in cell-cell/matrix interactions and may modulate G-protein-coupled signal-transduction pathways. It plays a role in renal tubular development, and mutations in this gene cause autosomal dominant polycystic kidney disease type 1 (ADPKD1). ADPKD1 is characterized by the growth of fluid-filled cysts that replace normal renal tissue and result in end-stage renal failure. Splice variants encoding different isoforms have been noted for this gene. Also, six pseudogenes, closely linked in a known duplicated region on chromosome 16p, have been described. | PKD1 | ENSG00000008710 | NA |
| glutathione peroxidase 3 | 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | GPX3 | ENSG00000211445 | NA |
| MAM domain containing glycosylphosphatidylinositol anchor 1 | 266727 | NA | MDGA1 | ENSG00000112139 | NA |
| F-box and leucine rich repeat protein 16 | 146330 | Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | FBXL16 | ENSG00000127585 | NA |
| cerebellin 3 precursor | 643866 | Members of the precerebellin family, such as CBLN3, contain a cerebellin motif (see CBLN1; MIM 600432) and a C-terminal C1q signature domain (see MIM 120550) that mediates trimeric assembly of atypical collagen complexes. However, precerebellins do not contain a collagen motif, suggesting that they are not conventional components of the extracellular matrix (Pang et al., 2000 [PubMed 10964938]). | CBLN3 | ENSG00000139899 | NA |
| cortexin 1 | 404217 | NA | CTXN1 | ENSG00000178531 | NA |
| keratin 2 | 3849 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT2 | ENSG00000172867 | NA |
| ectodermal-neural cortex 1 | 8507 | This gene encodes a member of the kelch-related family of actin-binding proteins. The encoded protein plays a role in the oxidative stress response as a regulator of the transcription factor Nrf2, and expression of this gene may play a role in malignant transformation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENC1 | ENSG00000171617 | NA |
| pleckstrin and Sec7 domain containing | 5662 | This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | PSD | ENSG00000059915 | NA |
| NA | ENSG00000269968 | NA | RP5-940J5.9 | ENSG00000269968 | NA |
| chromogranin B | 1114 | This gene encodes a tyrosine-sulfated secretory protein abundant in peptidergic endocrine cells and neurons. This protein may serve as a precursor for regulatory peptides. | CHGB | ENSG00000089199 | NA |
| TIMP metallopeptidase inhibitor 2 | 7077 | This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | TIMP2 | ENSG00000035862 | NA |
| chimerin 1 | 1123 | This gene encodes GTPase-activating protein for ras-related p21-rac and a phorbol ester receptor. It is predominantly expressed in neurons, and plays an important role in neuronal signal-transduction mechanisms. Mutations in this gene are associated with Duane’s retraction syndrome 2 (DURS2). Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | CHN1 | ENSG00000128656 | NA |
| PERP, TP53 apoptosis effector | 64065 | NA | PERP | ENSG00000112378 | NA |
| desmoplakin | 1832 | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | DSP | ENSG00000096696 | NA |
| NA | NA | NA | NA | ENSG00000163486 | TRUE |
| surfactant protein B | 6439 | This gene encodes the pulmonary-associated surfactant protein B (SPB), an amphipathic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. The SPB enhances the rate of spreading and increases the stability of surfactant monolayers in vitro. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 1, also called pulmonary alveolar proteinosis due to surfactant protein B deficiency, and are associated with fatal respiratory distress in the neonatal period. Alternatively spliced transcript variants encoding the same protein have been identified. | SFTPB | ENSG00000168878 | NA |
| actin, beta | 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ACTB | ENSG00000075624 | NA |
| T-cell lymphoma invasion and metastasis 1 | 7074 | NA | TIAM1 | ENSG00000156299 | NA |
| integral membrane protein 2C | 81618 | NA | ITM2C | ENSG00000135916 | NA |
| Ca2+ dependent secretion activator 2 | 93664 | This gene encodes a member of the calcium-dependent activator of secretion (CAPS) protein family, which are calcium binding proteins that regulate the exocytosis of synaptic and dense-core vesicles in neurons and neuroendocrine cells. Mutations in this gene may contribute to autism susceptibility. Multiple transcript variants encoding different isoforms have been found for this gene. | CADPS2 | ENSG00000081803 | NA |
| myosin, heavy chain 6, cardiac muscle, alpha | 4624 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | MYH6 | ENSG00000197616 | NA |
| GNAS complex locus | 2778 | This locus has a highly complex imprinted expression pattern. It gives rise to maternally, paternally, and biallelically expressed transcripts that are derived from four alternative promoters and 5’ exons. Some transcripts contain a differentially methylated region (DMR) at their 5’ exons, and this DMR is commonly found in imprinted genes and correlates with transcript expression. An antisense transcript is produced from an overlapping locus on the opposite strand. One of the transcripts produced from this locus, and the antisense transcript, are paternally expressed noncoding RNAs, and may regulate imprinting in this region. In addition, one of the transcripts contains a second overlapping ORF, which encodes a structurally unrelated protein - Alex. Alternative splicing of downstream exons is also observed, which results in different forms of the stimulatory G-protein alpha subunit, a key element of the classical signal transduction pathway linking receptor-ligand interactions with the activation of adenylyl cyclase and a variety of cellular responses. Multiple transcript variants encoding different isoforms have been found for this gene. Mutations in this gene result in pseudohypoparathyroidism type 1a, pseudohypoparathyroidism type 1b, Albright hereditary osteodystrophy, pseudopseudohypoparathyroidism, McCune-Albright syndrome, progressive osseous heteroplasia, polyostotic fibrous dysplasia of bone, and some pituitary tumors. | GNAS | ENSG00000087460 | NA |
| suppressor of glucose, autophagy associated 1 | 140710 | NA | SOGA1 | ENSG00000149639 | NA |
| surfactant protein A2 | 729238 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | SFTPA2 | ENSG00000185303 | NA |
| chromogranin A | 1113 | The protein encoded by this gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins. It is found in secretory vesicles of neurons and endocrine cells. This gene product is a precursor to three biologically active peptides; vasostatin, pancreastatin, and parastatin. These peptides act as autocrine or paracrine negative modulators of the neuroendocrine system. Two other peptides, catestatin and chromofungin, have antimicrobial activity and antifungal activity, respectively. Two transcript variants encoding different isoforms have been found for this gene. | CHGA | ENSG00000100604 | NA |
| tetraspanin 9 | 10867 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. Alternatively spliced transcripts encoding the same protein have been identified. | TSPAN9 | ENSG00000011105 | NA |
| VIM antisense RNA 1 | 100507347 | NA | VIM-AS1 | ENSG00000229124 | NA |
| NA | NA | NA | NA | ENSG00000117289 | TRUE |
| microtubule associated monooxygenase, calponin and LIM domain containing 2 | 9645 | NA | MICAL2 | ENSG00000133816 | NA |
| FK506 binding protein 5 | 2289 | The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. This encoded protein is a cis-trans prolyl isomerase that binds to the immunosuppressants FK506 and rapamycin. It is thought to mediate calcineurin inhibition. It also interacts functionally with mature hetero-oligomeric progesterone receptor complexes along with the 90 kDa heat shock protein and P23 protein. This gene has been found to have multiple polyadenylation sites. Alternative splicing results in multiple transcript variants. | FKBP5 | ENSG00000096060 | NA |
| keratin 14 | 3861 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | KRT14 | ENSG00000186847 | NA |
| MCF.2 cell line derived transforming sequence like | 23263 | This gene encodes a guanine nucleotide exchange factor that interacts specifically with the GTP-bound Rac1 and plays a role in the Rho/Rac signaling pathways. A variant in this gene was associated with osteoarthritis. Alternative splicing results in multiple transcript variants. | MCF2L | ENSG00000126217 | NA |
| PTPRF interacting protein alpha 4 | 8497 | PPFIA4, or liprin-alpha-4, belongs to the liprin-alpha gene family. See liprin-alpha-1 (LIP1, or PPFIA1; MIM 611054) for background on liprins. | PPFIA4 | ENSG00000143847 | NA |
| transducin like enhancer of split 2 | 7089 | NA | TLE2 | ENSG00000065717 | NA |
| surfactant protein C | 6440 | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | SFTPC | ENSG00000168484 | NA |
| surfactant protein A1 | 653509 | This gene encodes a lung surfactant protein that is a member of a subfamily of C-type lectins called collectins. The encoded protein binds specific carbohydrate moieties found on lipids and on the surface of microorganisms. This protein plays an essential role in surfactant homeostasis and in the defense against respiratory pathogens. Mutations in this gene are associated with idiopathic pulmonary fibrosis. Alternate splicing results in multiple transcript variants. | SFTPA1 | ENSG00000122852 | NA |
| NA | ENSG00000234961 | NA | RP11-124N14.3 | ENSG00000234961 | NA |
| ERBB receptor feedback inhibitor 1 | 54206 | ERRFI1 is a cytoplasmic protein whose expression is upregulated with cell growth (Wick et al., 1995 [PubMed 7641805]). It shares significant homology with the protein product of rat gene-33, which is induced during cell stress and mediates cell signaling (Makkinje et al., 2000 [PubMed 10749885]; Fiorentino et al., 2000 [PubMed 11003669]). | ERRFI1 | ENSG00000116285 | NA |
| FK506 binding protein 8 | 23770 | The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. Unlike the other members of the family, this encoded protein does not seem to have PPIase/rotamase activity. It may have a role in neurons associated with memory function. | FKBP8 | ENSG00000105701 | NA |
| pyruvate kinase, muscle | 5315 | This gene encodes a protein involved in glycolysis. The encoded protein is a pyruvate kinase that catalyzes the transfer of a phosphoryl group from phosphoenolpyruvate to ADP, generating ATP and pyruvate. This protein has been shown to interact with thyroid hormone and may mediate cellular metabolic effects induced by thyroid hormones. This protein has been found to bind Opa protein, a bacterial outer membrane protein involved in gonococcal adherence to and invasion of human cells, suggesting a role of this protein in bacterial pathogenesis. Several alternatively spliced transcript variants encoding a few distinct isoforms have been reported. | PKM | ENSG00000067225 | NA |
| Thy-1 cell surface antigen | 7070 | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | THY1 | ENSG00000154096 | NA |
| natriuretic peptide A | 4878 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | NPPA | ENSG00000175206 | NA |
| fatty acid synthase | 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | FASN | ENSG00000169710 | NA |
| ALS2, alsin Rho guanine nucleotide exchange factor | 57679 | The protein encoded by this gene contains an ATS1/RCC1-like domain, a RhoGEF domain, and a vacuolar protein sorting 9 (VPS9) domain, all of which are guanine-nucleotide exchange factors that activate members of the Ras superfamily of GTPases. The protein functions as a guanine nucleotide exchange factor for the small GTPase RAB5. The protein localizes with RAB5 on early endosomal compartments, and functions as a modulator for endosomal dynamics. Mutations in this gene result in several forms of juvenile lateral sclerosis and infantile-onset ascending spastic paralysis. Multiple transcript variants encoding different isoforms have been found for this gene. | ALS2 | ENSG00000003393 | NA |
| heat shock protein 90kDa alpha family class A member 1 | 3320 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | HSP90AA1 | ENSG00000080824 | NA |
| collagen type XXVII alpha 1 | 85301 | This gene encodes a member of the fibrillar collagen family, and plays a role during the calcification of cartilage and the transition of cartilage to bone. The encoded protein product is a preproprotein. It includes an N-terminal signal peptide, which is followed by an N-terminal propeptide, mature peptide and a C-terminal propeptide. The N-terminal propeptide contains thrombospondin N-terminal-like and laminin G-like domains. The mature peptide is a major triple-helical region. The C-terminal propeptide, also known as COLFI domain, plays crucial roles in tissue growth and repair. Mutations in this gene cause Steel syndrome. Alternatively spliced transcript variants have been found, but the full-length nature of some variants has not been determined. | COL27A1 | ENSG00000196739 | NA |
| suprabasin | 374897 | NA | SBSN | ENSG00000189001 | NA |
| forkhead box N3 | 1112 | This gene is a member of the forkhead/winged helix transcription factor family. Checkpoints are eukaryotic DNA damage-inducible cell cycle arrests at G1 and G2. Checkpoint suppressor 1 suppresses multiple yeast checkpoint mutations including mec1, rad9, rad53 and dun1 by activating a MEC1-independent checkpoint pathway. Alternative splicing is observed at the locus, resulting in distinct isoforms. | FOXN3 | ENSG00000053254 | NA |
| basic helix-loop-helix family member e40 | 8553 | This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL’s transactivation of PER1. This gene is believed to be involved in the control of circadian rhythm and cell differentiation. | BHLHE40 | ENSG00000134107 | NA |
| pyruvate dehydrogenase kinase 4 | 5166 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | PDK4 | ENSG00000004799 | NA |
| integrin subunit alpha 5 | 3678 | The product of this gene belongs to the integrin alpha chain family. Integrins are heterodimeric integral membrane proteins composed of an alpha subunit and a beta subunit that function in cell surface adhesion and signaling. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 5 subunit. This subunit associates with the beta 1 subunit to form a fibronectin receptor. This integrin may promote tumor invasion, and higher expression of this gene may be correlated with shorter survival time in lung cancer patients. Note that the integrin alpha 5 and integrin alpha V subunits are encoded by distinct genes. | ITGA5 | ENSG00000161638 | NA |
| BAI1 associated protein 3 | 8938 | This p53-target gene encodes a brain-specific angiogenesis inhibitor. The protein is a seven-span transmembrane protein and a member of the secretin receptor family. It interacts with the cytoplasmic region of brain-specific angiogenesis inhibitor 1. This protein also contains two C2 domains, which are often found in proteins involved in signal transduction or membrane trafficking. Its expression pattern and similarity to other proteins suggest that it may be involved in synaptic functions. Several transcript variants encoding different isoforms have been found for this gene. | BAIAP3 | ENSG00000007516 | NA |
| laminin subunit alpha 5 | 3911 | This gene encodes one of the vertebrate laminin alpha chains. Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. The protein encoded by this gene is the alpha-5 subunit of laminin-10 (laminin-511), laminin-11 (laminin-521) and laminin-15 (laminin-523). | LAMA5 | ENSG00000130702 | NA |
| PATJ, crumbs cell polarity complex component | 10207 | This gene encodes a protein with multiple PDZ domains. PDZ domains mediate protein-protein interactions, and proteins with multiple PDZ domains often organize multimeric complexes at the plasma membrane. This protein localizes to tight junctions and to the apical membrane of epithelial cells. A similar protein in Drosophila is a scaffolding protein which tethers several members of a multimeric signaling complex in photoreceptors. | PATJ | ENSG00000132849 | NA |
| transglutaminase 2 | 7052 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. | TGM2 | ENSG00000198959 | NA |
| protein kinase (cAMP-dependent, catalytic) inhibitor beta | 5570 | This gene encodes a member of the cAMP-dependent protein kinase inhibitor family. The encoded protein may play a role in the protein kinase A (PKA) pathway by interacting with the catalytic subunit of PKA, and overexpression of this gene may play a role in prostate cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | PKIB | ENSG00000135549 | NA |
| ornithine decarboxylase antizyme 1 | 4946 | The protein encoded by this gene belongs to the ornithine decarboxylase antizyme family, which plays a role in cell growth and proliferation by regulating intracellular polyamine levels. Expression of antizymes requires +1 ribosomal frameshifting, which is enhanced by high levels of polyamines. Antizymes in turn bind to and inhibit ornithine decarboxylase (ODC), the key enzyme in polyamine biosynthesis; thus, completing the auto-regulatory circuit. This gene encodes antizyme 1, the first member of the antizyme family, that has broad tissue distribution, and negatively regulates intracellular polyamine levels by binding to and targeting ODC for degradation, as well as inhibiting polyamine uptake. Antizyme 1 mRNA contains two potential in-frame AUGs; and studies in rat suggest that alternative use of the two translation initiation sites results in N-terminally distinct protein isoforms with different subcellular localization. Alternatively spliced transcript variants have also been noted for this gene. | OAZ1 | ENSG00000104904 | NA |
| dermokine | 93099 | This gene is upregulated in inflammatory diseases, and it was first observed as expressed in the differentiated layers of skin. The most interesting aspect of this gene is the differential use of promoters and terminators to generate isoforms with unique cellular distributions and domain components. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | DMKN | ENSG00000161249 | NA |
| calcium/calmodulin dependent protein kinase II inhibitor 1 | 55450 | NA | CAMK2N1 | ENSG00000162545 | NA |
| chromodomain helicase DNA binding protein 7 | 55636 | This gene encodes a protein that contains several helicase family domains. Mutations in this gene have been found in some patients with the CHARGE syndrome. Two transcript variants encoding different isoforms have been found for this gene. | CHD7 | ENSG00000171316 | NA |
| calmodulin like 5 | 51806 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. | CALML5 | ENSG00000178372 | NA |
| protein disulfide isomerase family A member 2 | 64714 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | PDIA2 | ENSG00000185615 | NA |
| plakophilin 1 | 5317 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may be involved in molecular recruitment and stabilization during desmosome formation. Mutations in this gene have been associated with the ectodermal dysplasia/skin fragility syndrome. Two transcript variants encoding different isoforms have been found for this gene. | PKP1 | ENSG00000081277 | NA |
| aldolase, fructose-bisphosphate A | 226 | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | ALDOA | ENSG00000149925 | NA |
| proline rich coiled-coil 2A | 7916 | A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF alpha and TNF beta. These genes are all within the human major histocompatibility complex class III region. This gene has microsatellite repeats which are associated with the age-at-onset of insulin-dependent diabetes mellitus (IDDM) and possibly thought to be involved with the inflammatory process of pancreatic beta-cell destruction during the development of IDDM. This gene is also a candidate gene for the development of rheumatoid arthritis. Two transcript variants encoding the same protein have been found for this gene. | PRRC2A | ENSG00000204469 | NA |
| solute carrier family 38 member 1 | 81539 | Amino acid transporters play essential roles in the uptake of nutrients, production of energy, chemical metabolism, detoxification, and neurotransmitter cycling. SLC38A1 is an important transporter of glutamine, an intermediate in the detoxification of ammonia and the production of urea. Glutamine serves as a precursor for the synaptic transmitter, glutamate (Gu et al., 2001 [PubMed 11325958]). | SLC38A1 | ENSG00000111371 | NA |
| adenosine deaminase, RNA specific B1 | 104 | This gene encodes the enzyme responsible for pre-mRNA editing of the glutamate receptor subunit B by site-specific deamination of adenosines. Studies in rat found that this enzyme acted on its own pre-mRNA molecules to convert an AA dinucleotide to an AI dinucleotide which resulted in a new splice site. Alternative splicing of this gene results in several transcript variants, some of which have been characterized by the presence or absence of an ALU cassette insert and a short or long C-terminal region. | ADARB1 | ENSG00000197381 | NA |
| heat shock protein family B (small) member 7 | 27129 | NA | HSPB7 | ENSG00000173641 | NA |
| tropomyosin 3 | 7170 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. | TPM3 | ENSG00000143549 | NA |
| ubiquitin C-terminal hydrolase L1 | 7345 | The protein encoded by this gene belongs to the peptidase C12 family. This enzyme is a thiol protease that hydrolyzes a peptide bond at the C-terminal glycine of ubiquitin. This gene is specifically expressed in the neurons and in cells of the diffuse neuroendocrine system. Mutations in this gene may be associated with Parkinson disease. | UCHL1 | ENSG00000154277 | NA |
| loricrin | 4014 | This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel’s syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases. | LOR | ENSG00000203782 | NA |
| NA | ENSG00000271795 | NA | CTC-251D13.1 | ENSG00000271795 | NA |
| tyrosine kinase non receptor 2 | 10188 | This gene encodes a tyrosine kinase that binds Cdc42Hs in its GTP-bound form and inhibits both the intrinsic and GTPase-activating protein (GAP)-stimulated GTPase activity of Cdc42Hs. This binding is mediated by a unique sequence of 47 amino acids C-terminal to an SH3 domain. The protein may be involved in a regulatory mechanism that sustains the GTP-bound active form of Cdc42Hs and which is directly linked to a tyrosine phosphorylation signal transduction pathway. Several alternatively spliced transcript variants have been identified from this gene, but the full-length nature of only two transcript variants has been determined. | TNK2 | ENSG00000061938 | NA |
| regulator of G-protein signaling 14 | 10636 | This gene encodes a member of the regulator of G-protein signaling family. This protein contains one RGS domain, two Raf-like Ras-binding domains (RBDs), and one GoLoco domain. The protein attenuates the signaling activity of G-proteins by binding, through its GoLoco domain, to specific types of activated, GTP-bound G alpha subunits. Acting as a GTPase activating protein (GAP), the protein increases the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. Alternate transcriptional splice variants of this gene have been observed but have not been thoroughly characterized. | RGS14 | ENSG00000169220 | NA |
| fibrosin | 64319 | Fibrosin is a lymphokine secreted by activated lymphocytes that induces fibroblast proliferation (Prakash and Robbins, 1998 [PubMed 9809749]). | FBRS | ENSG00000156860 | NA |
| microtubule associated serine/threonine kinase 3 | 23031 | NA | MAST3 | ENSG00000099308 | NA |
| CAP-Gly domain containing linker protein 1 | 6249 | The protein encoded by this gene links endocytic vesicles to microtubules. This gene is highly expressed in Reed-Sternberg cells of Hodgkin disease. Several transcript variants encoding different isoforms have been found for this gene. | CLIP1 | ENSG00000130779 | NA |
| metallothionein 3 | 4504 | NA | MT3 | ENSG00000087250 | NA |
| Kruppel like factor 9 | 687 | The protein encoded by this gene is a transcription factor that binds to GC box elements located in the promoter. Binding of the encoded protein to a single GC box inhibits mRNA expression while binding to tandemly repeated GC box elements activates transcription. | KLF9 | ENSG00000119138 | NA |
| low density lipoprotein receptor adaptor protein 1 | 26119 | The protein encoded by this gene is a cytosolic protein which contains a phosphotyrosine binding (PTD) domain. The PTD domain has been found to interact with the cytoplasmic tail of the LDL receptor. Mutations in this gene lead to LDL receptor malfunction and cause the disorder autosomal recessive hypercholesterolaemia. | LDLRAP1 | ENSG00000157978 | NA |
| elastin | 2006 | This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | ELN | ENSG00000049540 | NA |
| ATH1, acid trehalase-like 1 (yeast) | 80162 | NA | ATHL1 | ENSG00000142102 | NA |
| FXYD domain containing ion transport regulator 7 | 53822 | This reference sequence was derived from multiple replicate ESTs and validated by similar human genomic sequence. This gene encodes a member of a family of small membrane proteins that share a 35-amino acid signature sequence domain, beginning with the sequence PFXYD and containing 7 invariant and 6 highly conserved amino acids. The approved human gene nomenclature for the family is FXYD-domain containing ion transport regulator. Transmembrane topology has been established for two family members (FXYD1 and FXYD2), with the N-terminus extracellular and the C-terminus on the cytoplasmic side of the membrane. FXYD2, also known as the gamma subunit of the Na,K-ATPase, regulates the properties of that enzyme. FXYD1 (phospholemman), FXYD2 (gamma), FXYD3 (MAT-8), FXYD4 (CHIF), and FXYD5 (RIC) have been shown to induce channel activity in experimental expression systems. This gene product, FXYD7, is novel and has not been characterized as a protein. [RefSeq curation by Kathleen J. Sweadner, Ph.D., sweadner@helix.mgh.harvard.edu., Dec 2000]. | FXYD7 | ENSG00000221946 | NA |
| calmodulin 2 (phosphorylase kinase, delta) | 805 | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | CALM2 | ENSG00000143933 | NA |
| major histocompatibility complex, class I, B | 3106 | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | HLA-B | ENSG00000234745 | NA |
| murine retrovirus integration site 1 homolog | 10335 | This gene is similar to a putative mouse tumor suppressor gene (Mrvi1) that is frequently disrupted by mouse AIDS-related virus (MRV). The encoded protein, which is found in the membrane of the endoplasmic reticulum, is similar to Jaw1, a lymphoid-restricted protein whose expression is down-regulated during lymphoid differentiation. This protein is a substrate of cGMP-dependent kinase-1 (PKG1) that can function as a regulator of IP3-induced calcium release. Studies in mouse suggest that MRV integration at Mrvi1 induces myeloid leukemia by altering the expression of a gene important for myeloid cell growth and/or differentiation, and thus this gene may function as a myeloid leukemia tumor suppressor gene. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, and alternative translation start sites, including a non-AUG (CUG) start site, are used. | MRVI1 | ENSG00000072952 | NA |
| interferon induced transmembrane protein 3 | 10410 | The protein encoded by this gene is an interferon-induced membrane protein that helps confer immunity to influenza A H1N1 virus, West Nile virus, and dengue virus. Two transcript variants, only one of them protein-coding, have been found for this gene. Another variant encoding an N-terminally truncated isoform has been reported, but the full-length nature of this variant has not been determined. | IFITM3 | ENSG00000142089 | NA |
| metallothionein 1X | 4501 | NA | MT1X | ENSG00000187193 | NA |
| transmembrane protein 178A | 130733 | NA | TMEM178A | ENSG00000152154 | NA |
| intercellular adhesion molecule 5 | 7087 | The protein encoded by this gene is a member of the intercellular adhesion molecule (ICAM) family. All ICAM proteins are type I transmembrane glycoproteins, contain 2-9 immunoglobulin-like C2-type domains, and bind to the leukocyte adhesion LFA-1 protein. This protein is expressed on the surface of telencephalic neurons and displays two types of adhesion activity, homophilic binding between neurons and heterophilic binding between neurons and leukocytes. It may be a critical component in neuron-microglial cell interactions in the course of normal development or as part of neurodegenerative diseases. | ICAM5 | ENSG00000105376 | NA |
| basigin (Ok blood group) | 682 | The protein encoded by this gene is a plasma membrane protein that is important in spermatogenesis, embryo implantation, neural network formation, and tumor progression. The encoded protein is also a member of the immunoglobulin superfamily. Multiple transcript variants encoding different isoforms have been found for this gene. | BSG | ENSG00000172270 | NA |
| collagen type VI alpha 2 | 1292 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | COL6A2 | ENSG00000142173 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 11, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
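The query-then-write pattern is repeated once per cluster with only the cluster index changing. A minimal sketch of how it could be wrapped in a helper follows; the function name `annotate_cluster`, its `out_dir` and `query_fun` arguments, and the loop shown in the trailing comment are assumptions for illustration, not part of the original analysis. It writes the query IDs directly rather than through `as.factor`, which should give the same file contents when `quote = FALSE`.

```r
# Hypothetical helper (not from the original analysis): annotate one cluster
# and write its queried gene IDs to "gene_names_clus_<cluster>.txt".
# `query_fun` defaults to mygene::queryMany, the call used in this document;
# it is an argument so the helper can be exercised without network access.
annotate_cluster <- function(gene_ids, cluster, out_dir,
                             query_fun = mygene::queryMany) {
  out <- query_fun(gene_ids,
                   scopes = "ensembl.gene",
                   fields = c("name", "summary", "symbol"),
                   species = "human")
  out_file <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  # One gene ID per line, no header, no row names, no quotes.
  write.table(out$query, out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(as.data.frame(out))
}

# Possible usage, assuming `gene_list` has one row of Ensembl IDs per cluster:
# for (k in seq_len(nrow(gene_list))) {
#   annotate_cluster(gene_list[k, ], k, "../utilities/GTEX2013_sparse_load_sqrt")
# }
```

The returned data frame can still be passed to `kable()` to render each cluster's table as before.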
out <- mygene::queryMany(gene_list[12,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | query | symbol | name |
|---|---|---|---|---|
| This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | ENSG00000026025 | VIM | vimentin |
| This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | ENSG00000175084 | DES | desmin |
| This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | 1292 | ENSG00000142173 | COL6A2 | collagen type VI alpha 2 |
| Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | 5620 | ENSG00000122304 | PRM2 | protamine 2 |
| This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | ENSG00000186395 | KRT10 | keratin 10 |
| This gene encodes the light subunit of the ferritin protein. Ferritin is the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in this light chain ferritin gene are associated with several neurodegenerative diseases and hyperferritinemia-cataract syndrome. This gene has multiple pseudogenes. | 2512 | ENSG00000087086 | FTL | ferritin, light polypeptide |
| Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | 4624 | ENSG00000197616 | MYH6 | myosin, heavy chain 6, cardiac muscle, alpha |
| This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | 1832 | ENSG00000096696 | DSP | desmoplakin |
| NA | 64065 | ENSG00000112378 | PERP | PERP, TP53 apoptosis effector |
| NA | 5619 | ENSG00000175646 | PRM1 | protamine 1 |
| This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 | ENSG00000011465 | DCN | decorin |
| The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | 3043 | ENSG00000244734 | HBB | hemoglobin subunit beta |
| This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may be involved in molecular recruitment and stabilization during desmosome formation. Mutations in this gene have been associated with the ectodermal dysplasia/skin fragility syndrome. Two transcript variants encoding different isoforms have been found for this gene. | 5317 | ENSG00000081277 | PKP1 | plakophilin 1 |
| Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | 2192 | ENSG00000077942 | FBLN1 | fibulin 1 |
| This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1281 | ENSG00000168542 | COL3A1 | collagen type III alpha 1 chain |
| This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1277 | ENSG00000108821 | COL1A1 | collagen type I alpha 1 |
| The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | 4878 | ENSG00000175206 | NPPA | natriuretic peptide A |
| This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | 3861 | ENSG00000186847 | KRT14 | keratin 14 |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3852 | ENSG00000186081 | KRT5 | keratin 5 |
| The secreted protein encoded by this gene is growth factor-inducible and promotes the adhesion of endothelial cells. The encoded protein interacts with several integrins and with heparan sulfate proteoglycan. This protein also plays a role in cell proliferation, differentiation, angiogenesis, apoptosis, and extracellular matrix formation. | 3491 | ENSG00000142871 | CYR61 | cysteine rich angiogenic inducer 61 |
| This gene encodes a major cytoplasmic protein which is the only known constituent common to submembranous plaques of both desmosomes and intermediate junctions. This protein forms distinct complexes with cadherins and desmosomal cadherins and is a member of the catenin family since it contains a distinct repeating amino acid motif called the armadillo repeat. Mutation in this gene has been associated with Naxos disease. Alternative splicing occurs in this gene; however, not all transcripts have been fully described. | 3728 | ENSG00000173801 | JUP | junction plakoglobin |
| NA | 374897 | ENSG00000189001 | SBSN | suprabasin |
| NA | 100507347 | ENSG00000229124 | VIM-AS1 | VIM antisense RNA 1 |
| This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | 1293 | ENSG00000163359 | COL6A3 | collagen type VI alpha 3 chain |
| The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | 4256 | ENSG00000111341 | MGP | matrix Gla protein |
| NA | ENSG00000237973 | ENSG00000237973 | MTCO1P12 | MT-CO1 pseudogene 12 |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3849 | ENSG00000172867 | KRT2 | keratin 2 |
| This gene encodes a highly glycosylated plasma protein involved in the regulation of the complement cascade. Its protein inhibits activated C1r and C1s of the first complement component and thus regulates complement activation. Deficiency of this protein is associated with hereditary angioneurotic oedema (HANE). Alternative splicing results in multiple transcript variants encoding the same isoform. | 710 | ENSG00000149131 | SERPING1 | serpin family G member 1 |
| This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 | ENSG00000128591 | FLNC | filamin C |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3848 | ENSG00000167768 | KRT1 | keratin 1 |
| This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 | ENSG00000075624 | ACTB | actin, beta |
| NA | ENSG00000234961 | ENSG00000234961 | RP11-124N14.3 | NA |
| This gene encodes a protein with similarity to follistatin, an activin-binding protein. It contains an FS module, a follistatin-like sequence containing 10 conserved cysteine residues. This gene product is thought to be an autoantigen associated with rheumatoid arthritis. | 11167 | ENSG00000163430 | FSTL1 | follistatin like 1 |
| The cytoplasmic peripheral membrane protein encoded by this gene functions as a protein-tyrosine kinase substrate in microvilli. As a member of the ERM protein family, this protein serves as an intermediate between the plasma membrane and the actin cytoskeleton. This protein plays a key role in cell surface structure adhesion, migration and organization, and it has been implicated in various human cancers. A pseudogene located on chromosome 3 has been identified for this gene. Alternatively spliced variants have also been described for this gene. | 7430 | ENSG00000092820 | EZR | ezrin |
| This gene encodes a protein that enables the dissociation of paused ternary polymerase I transcription complexes from the 3’ end of pre-rRNA transcripts. This protein regulates rRNA transcription by promoting the dissociation of transcription complexes and the reinitiation of polymerase I on nascent rRNA transcripts. This protein also localizes to caveolae at the plasma membrane and is thought to play a critical role in the formation of caveolae and the stabilization of caveolins. This protein translocates from caveolae to the cytoplasm after insulin stimulation. Caveolae contain truncated forms of this protein and may be the site of phosphorylation-dependent proteolysis. This protein is also thought to modify lipid metabolism and insulin-regulated gene expression. Mutations in this gene result in a disorder characterized by generalized lipodystrophy and muscular dystrophy. | 284119 | ENSG00000177469 | PTRF | polymerase I and transcript release factor |
| Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | 70 | ENSG00000159251 | ACTC1 | actin, alpha, cardiac muscle 1 |
| NA | 27129 | ENSG00000173641 | HSPB7 | heat shock protein family B (small) member 7 |
| This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. | 3487 | ENSG00000141753 | IGFBP4 | insulin like growth factor binding protein 4 |
| This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | 4151 | ENSG00000198125 | MB | myoglobin |
| Spectrins are principal components of a cell’s membrane-cytoskeleton and are composed of two alpha and two beta spectrin subunits. The protein encoded by this gene (SPTBN2) is called spectrin beta non-erythrocytic 2 or beta-III spectrin. It is related to, but distinct from, the beta-II spectrin gene which is also known as spectrin beta non-erythrocytic 1 (SPTBN1). SPTBN2 regulates the glutamate signaling pathway by stabilizing the glutamate transporter EAAT4 at the surface of the plasma membrane. Mutations in this gene cause a form of spinocerebellar ataxia, SCA5, that is characterized by neurodegeneration, progressive locomotor incoordination, dysarthria, and uncoordinated eye movements. | 6712 | ENSG00000173898 | SPTBN2 | spectrin beta, non-erythrocytic 2 |
| NA | ENSG00000225630 | ENSG00000225630 | MTND2P28 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | 6280 | ENSG00000163220 | S100A9 | S100 calcium binding protein A9 |
| NA | 715 | ENSG00000159403 | C1R | complement C1r subcomponent |
| NA | ENSG00000211895 | ENSG00000211895 | IGHA1 | immunoglobulin heavy constant alpha 1 |
| Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | 1410 | ENSG00000109846 | CRYAB | crystallin alpha B |
| This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | ENSG00000115414 | FN1 | fibronectin 1 |
| This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 100% identical to the mouse ortholog. It interacts with CDC25 phosphatases, RAF1 and IRS1 proteins, suggesting its role in diverse biochemical activities related to signal transduction, such as cell division and regulation of insulin sensitivity. It has also been implicated in the pathogenesis of small cell lung cancer. Two transcript variants, one protein-coding and the other non-protein-coding, have been found for this gene. | 7531 | ENSG00000108953 | YWHAE | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein epsilon |
| This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or intracellularly by its S-glutathiolation with no requirement for proteolytic removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | 4313 | ENSG00000087245 | MMP2 | matrix metallopeptidase 2 |
| This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | 57447 | ENSG00000165795 | NDRG2 | NDRG family member 2 |
| Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | 718 | ENSG00000125730 | C3 | complement component 3 |
| This gene encodes a cell surface tyrosine kinase receptor for members of the platelet-derived growth factor family. These growth factors are mitogens for cells of mesenchymal origin. The identity of the growth factor bound to a receptor monomer determines whether the functional receptor is a homodimer or a heterodimer, composed of both platelet-derived growth factor receptor alpha and beta polypeptides. This gene is flanked on chromosome 5 by the genes for granulocyte-macrophage colony-stimulating factor and macrophage-colony stimulating factor receptor; all three genes may be implicated in the 5-q syndrome. A translocation between chromosomes 5 and 12, that fuses this gene to that of the translocation, ETV6, leukemia gene, results in chronic myeloproliferative disorder with eosinophilia. | 5159 | ENSG00000113721 | PDGFRB | platelet derived growth factor receptor beta |
| NA | 100129518 | ENSG00000112096 | LOC100129518 | uncharacterized LOC100129518 |
| This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | 6648 | ENSG00000112096 | SOD2 | superoxide dismutase 2, mitochondrial |
| The protein encoded by this gene is a mitogen that is secreted by vascular endothelial cells. The encoded protein plays a role in chondrocyte proliferation and differentiation, cell adhesion in many cell types, and is related to platelet-derived growth factor. Certain polymorphisms in this gene have been linked with a higher incidence of systemic sclerosis. | 1490 | ENSG00000118523 | CTGF | connective tissue growth factor |
| This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | 716 | ENSG00000182326 | C1S | complement component 1, s subcomponent |
| The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | 7139 | ENSG00000118194 | TNNT2 | troponin T2, cardiac type |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | ENSG00000171401 | KRT13 | keratin 13 |
| Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | 146330 | ENSG00000127585 | FBXL16 | F-box and leucine rich repeat protein 16 |
| This gene encodes the alpha chain of type XII collagen, a member of the FACIT (fibril-associated collagens with interrupted triple helices) collagen family. Type XII collagen is a homotrimer found in association with type I collagen, an association that is thought to modify the interactions between collagen I fibrils and the surrounding matrix. Alternatively spliced transcript variants encoding different isoforms have been identified. | 1303 | ENSG00000111799 | COL12A1 | collagen type XII alpha 1 chain |
| This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | 11155 | ENSG00000122367 | LDB3 | LIM domain binding 3 |
| This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | 800 | ENSG00000122786 | CALD1 | caldesmon 1 |
| This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. The C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. | 1284 | ENSG00000134871 | COL4A2 | collagen type IV alpha 2 |
| The product of this gene belongs to the integrin alpha chain family. Integrins are heterodimeric integral membrane proteins composed of an alpha subunit and a beta subunit that function in cell surface adhesion and signaling. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 5 subunit. This subunit associates with the beta 1 subunit to form a fibronectin receptor. This integrin may promote tumor invasion, and higher expression of this gene may be correlated with shorter survival time in lung cancer patients. Note that the integrin alpha 5 and integrin alpha V subunits are encoded by distinct genes. | 3678 | ENSG00000161638 | ITGA5 | integrin subunit alpha 5 |
| The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3040 | ENSG00000188536 | HBA2 | hemoglobin subunit alpha 2 |
| NA | 151887 | ENSG00000091986 | CCDC80 | coiled-coil domain containing 80 |
| The obscurin gene spans more than 150 kb, contains over 80 exons and encodes a protein of approximately 720 kDa. The encoded protein contains 68 Ig domains, 2 fibronectin domains, 1 calcium/calmodulin-binding domain, 1 RhoGEF domain with an associated PH domain, and 2 serine-threonine kinase domains. This protein belongs to the family of giant sarcomeric signaling proteins that includes titin and nebulin, and may have a role in the organization of myofibrils during assembly and may mediate interactions between the sarcoplasmic reticulum and myofibrils. Alternatively spliced transcript variants encoding different isoforms have been identified. | 84033 | ENSG00000154358 | OBSCN | obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | 6279 | ENSG00000143546 | S100A8 | S100 calcium binding protein A8 |
| This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1278 | ENSG00000164692 | COL1A2 | collagen type I alpha 2 chain |
| The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | 3557 | ENSG00000136689 | IL1RN | interleukin 1 receptor antagonist |
| The protein encoded by this gene belongs to the TRIM protein family. It has multiple zinc finger motifs and a leucine zipper motif. It has been proposed to form homo- or heterodimers which are involved in nucleic acid binding. Thus, it may act as a transcriptional regulatory factor involved in carcinogenesis and/or differentiation. It may also function in the suppression of radiosensitivity since it is associated with ataxia telangiectasia phenotype. | 23650 | ENSG00000137699 | TRIM29 | tripartite motif containing 29 |
| This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | ENSG00000155657 | TTN | titin |
| NA | 7538 | ENSG00000128016 | ZFP36 | ZFP36 ring finger protein |
| The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | 6876 | ENSG00000149591 | TAGLN | transgelin |
| This gene encodes an RGD-containing protein that binds to type I, II and IV collagens. The RGD motif is found in many extracellular matrix proteins modulating cell adhesion and serves as a ligand recognition sequence for several integrins. This protein plays a role in cell-collagen interactions and may be involved in endochondral bone formation in cartilage. The protein is induced by transforming growth factor-beta and acts to inhibit cell adhesion. Mutations in this gene are associated with multiple types of corneal dystrophy. | 7045 | ENSG00000120708 | TGFBI | transforming growth factor beta induced |
| This gene encodes the perlecan protein, which consists of a core protein to which three long chains of glycosaminoglycans (heparan sulfate or chondroitin sulfate) are attached. The perlecan protein is a large multidomain proteoglycan that binds to and cross-links many extracellular matrix components and cell-surface molecules. It has been shown that this protein interacts with laminin, prolargin, collagen type IV, FGFBP1, FBLN2, FGF7 and transthyretin, etc., and it plays essential roles in multiple biological activities. Perlecan is a key component of the vascular extracellular matrix, where it helps to maintain the endothelial barrier function. It is a potent inhibitor of smooth muscle cell proliferation and is thus thought to help maintain vascular homeostasis. It can also promote growth factor (e.g., FGF2) activity and thus stimulate endothelial growth and re-generation. It is a major component of basement membranes, where it is involved in the stabilization of other molecules as well as being involved with glomerular permeability to macromolecules and cell adhesion. Mutations in this gene cause Schwartz-Jampel syndrome type 1, Silverman-Handmaker type of dyssegmental dysplasia, and tardive dyskinesia. Alternative splicing of this gene results in multiple transcript variants. | 3339 | ENSG00000142798 | HSPG2 | heparan sulfate proteoglycan 2 |
| C7 is a component of the complement system. It participates in the formation of Membrane Attack Complex (MAC). People with C7 deficiency are prone to bacterial infection. | 730 | ENSG00000112936 | C7 | complement component 7 |
| This gene encodes a serine/threonine protein kinase that localizes to mitochondria. It is thought to protect cells from stress-induced mitochondrial dysfunction. Mutations in this gene cause one form of autosomal recessive early-onset Parkinson disease. | 65018 | ENSG00000158828 | PINK1 | PTEN induced putative kinase 1 |
| This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | 4627 | ENSG00000100345 | MYH9 | myosin, heavy chain 9, non-muscle |
| This gene encodes a member of the heat shock protein 70 family, which contains both heat-inducible and constitutively expressed members. This protein belongs to the latter group, which are also referred to as heat-shock cognate proteins. It functions as a chaperone, and binds to nascent polypeptides to facilitate correct folding. It also functions as an ATPase in the disassembly of clathrin-coated vesicles during transport of membrane components through the cell. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 3312 | ENSG00000109971 | HSPA8 | heat shock protein family A (Hsp70) member 8 |
| This gene encodes a motor protein that transports mitochondria and synaptic vesicle precursors. Mutations in this gene cause Charcot-Marie-Tooth disease, type 2A1. | 23095 | ENSG00000054523 | KIF1B | kinesin family member 1B |
| This gene encodes a cytoskeletal protein that is concentrated in areas of cell-substratum and cell-cell contacts. The encoded protein plays a significant role in the assembly of actin filaments and in spreading and migration of various cell types, including fibroblasts and osteoclasts. It codistributes with integrins in the cell surface membrane in order to assist in the attachment of adherent cells to extracellular matrices and of lymphocytes to other cells. The N-terminus of this protein contains elements for localization to cell-extracellular matrix junctions. The C-terminus contains binding sites for proteins such as beta-1-integrin, actin, and vinculin. | 7094 | ENSG00000137076 | TLN1 | talin 1 |
| The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 | ENSG00000148677 | ANKRD1 | ankyrin repeat domain 1 |
| This gene encodes a member of the phosphatidylethanolamine-binding family of proteins and has been shown to modulate multiple signaling pathways, including the MAP kinase (MAPK), NF-kappa B, and glycogen synthase kinase-3 (GSK-3) signaling pathways. The encoded protein can be further processed to form a smaller cleavage product, hippocampal cholinergic neurostimulating peptide (HCNP), which may be involved in neural development. This gene has been implicated in numerous human cancers and may act as a metastasis suppressor gene. Multiple pseudogenes of this gene have been identified in the genome. | 5037 | ENSG00000089220 | PEBP1 | phosphatidylethanolamine binding protein 1 |
| This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 4604 | ENSG00000196091 | MYBPC1 | myosin binding protein C, slow type |
| This gene is a member of the cytochrome b(561) family that encodes an iron-regulated protein. It is highly expressed in the duodenal brush border membrane. It has ferric reductase activity and is believed to play a physiological role in dietary iron absorption. | 79901 | ENSG00000071967 | CYBRD1 | cytochrome b reductase 1 |
| NA | 58498 | ENSG00000106631 | MYL7 | myosin light chain 7 |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in stimulation of Ca2+-dependent insulin release, stimulation of prolactin secretion, and exocytosis. Chromosomal rearrangements and altered expression of this gene have been implicated in melanoma. | 6277 | ENSG00000197956 | S100A6 | S100 calcium binding protein A6 |
| Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | 10398 | ENSG00000101335 | MYL9 | myosin light chain 9 |
| The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | 5493 | ENSG00000118898 | PPL | periplakin |
| Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | 4625 | ENSG00000092054 | MYH7 | myosin, heavy chain 7, cardiac muscle, beta |
| This gene encodes a member of the Ser/Thr protein kinase family and the TGFB receptor subfamily. The encoded protein is a transmembrane protein that has a protein kinase domain, forms a heterodimeric complex with another receptor protein, and binds TGF-beta. This receptor/ligand complex phosphorylates proteins, which then enter the nucleus and regulate the transcription of a subset of genes related to cell proliferation. Mutations in this gene have been associated with Marfan Syndrome, Loeys-Dietz Aortic Aneurysm Syndrome, and the development of various types of tumors. Alternatively spliced transcript variants encoding different isoforms have been characterized. | 7048 | ENSG00000163513 | TGFBR2 | transforming growth factor beta receptor 2 |
| The scaffolding protein encoded by this gene is the main component of the caveolae plasma membranes found in most cell types. The protein links integrin subunits to the tyrosine kinase FYN, an initiating step in coupling integrins to the Ras-ERK pathway and promoting cell cycle progression. The gene is a tumor suppressor gene candidate and a negative regulator of the Ras-p42/44 mitogen-activated kinase cascade. Caveolin 1 and caveolin 2 are located next to each other on chromosome 7 and express colocalizing proteins that form a stable hetero-oligomeric complex. Mutations in this gene have been associated with Berardinelli-Seip congenital lipodystrophy. Alternatively spliced transcripts encode alpha and beta isoforms of caveolin 1. | 857 | ENSG00000105974 | CAV1 | caveolin 1 |
| This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that is required for cell cycle progression and survival in primary astrocytes and may be involved in the regulation of mitogenic signalling in vascular smooth muscle cells. Alternative splicing results in multiple transcripts encoding different isoforms. | 65009 | ENSG00000103034 | NDRG4 | NDRG family member 4 |
| NA | ENSG00000211896 | ENSG00000211896 | IGHG1 | immunoglobulin heavy constant gamma 1 (G1m marker) |
| This gene encodes a member of the heat shock protein 90 family; these proteins are involved in signal transduction, protein folding and degradation and morphological evolution. This gene encodes the constitutive form of the cytosolic 90 kDa heat-shock protein and is thought to play a role in gastric apoptosis and inflammation. Alternative splicing results in multiple transcript variants. Pseudogenes have been identified on multiple chromosomes. | 3326 | ENSG00000096384 | HSP90AB1 | heat shock protein 90kDa alpha family class B member 1 |
| This gene encodes a member of the fibulin family of extracellular matrix glycoproteins. Like all members of this family, the encoded protein contains tandemly repeated epidermal growth factor-like repeats followed by a C-terminus fibulin-type domain. This gene is upregulated in malignant gliomas and may play a role in the aggressive nature of these tumors. Mutations in this gene are associated with Doyne honeycomb retinal dystrophy. Alternatively spliced transcript variants that encode the same protein have been described. | 2202 | ENSG00000115380 | EFEMP1 | EGF containing fibulin like extracellular matrix protein 1 |
| NA | 5502 | ENSG00000135447 | PPP1R1A | protein phosphatase 1 regulatory inhibitor subunit 1A |
| NA | 79085 | ENSG00000125648 | SLC25A23 | solute carrier family 25 member 23 |
| The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. | 84525 | ENSG00000171476 | HOPX | HOP homeobox |
| NA | 6515 | ENSG00000059804 | SLC2A3 | solute carrier family 2 member 3 |
| The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | 3320 | ENSG00000080824 | HSP90AA1 | heat shock protein 90kDa alpha family class A member 1 |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_",12,".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
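The write step above is repeated for every cluster, so it could be wrapped in a small helper. The following is only a sketch: the function name `write_cluster_genes`, its arguments, and the call to `unique()` are assumptions for illustration and are not part of the original analysis. Dropping repeats may matter because `queryMany` can return more than one row for a single query ID.

```r
# Hypothetical helper (not in the original analysis): write the queried
# Ensembl IDs for one cluster to a text file, one ID per line.
write_cluster_genes <- function(ids, cluster, out_dir) {
  # A single query ID can appear in several result rows, so drop repeats.
  ids <- unique(as.character(ids))
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  writeLines(ids, path)
  invisible(path)
}

# Small demonstration with a repeated ID, written to a temporary directory.
demo_ids <- c("ENSG00000175084", "ENSG00000237298", "ENSG00000237298")
demo_path <- write_cluster_genes(demo_ids, cluster = 13, out_dir = tempdir())
written <- readLines(demo_path)
```

In the analysis itself the call would presumably look like `write_cluster_genes(out$query, 13, "../utilities/GTEX2013_sparse_load_sqrt")`, which produces the same file name pattern as the `write.table` call.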
out <- mygene::queryMany(gene_list[13,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | query | symbol | summary | X_id |
|---|---|---|---|---|
| desmin | ENSG00000175084 | DES | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 |
| hemoglobin subunit beta | ENSG00000244734 | HBB | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | 3043 |
| actinin alpha 2 | ENSG00000077522 | ACTN2 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | 88 |
| keratin 13 | ENSG00000171401 | KRT13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 |
| myosin, heavy chain 7, cardiac muscle, beta | ENSG00000092054 | MYH7 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | 4625 |
| ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 | ENSG00000174437 | ATP2A2 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. | 488 |
| actin, alpha 1, skeletal muscle | ENSG00000143632 | ACTA1 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | 58 |
| decorin | ENSG00000011465 | DCN | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 |
| titin | ENSG00000155657 | TTN | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 |
| creatine kinase, M-type | ENSG00000104879 | CKM | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | 1158 |
| filamin C | ENSG00000128591 | FLNC | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 |
| myoglobin | ENSG00000198125 | MB | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | 4151 |
| actin, beta | ENSG00000075624 | ACTB | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 |
| fibronectin 1 | ENSG00000115414 | FN1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 |
| uncharacterized LOC101927055 | ENSG00000237298 | LOC101927055 | NA | 101927055 |
| TTN antisense RNA 1 | ENSG00000237298 | TTN-AS1 | NA | 100506866 |
| myosin binding protein C, slow type | ENSG00000196091 | MYBPC1 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 4604 |
| keratin 4 | ENSG00000170477 | KRT4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3851 |
| Y-box binding protein 3 | ENSG00000060138 | YBX3 | NA | 8531 |
| aldolase, fructose-bisphosphate A | ENSG00000149925 | ALDOA | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | 226 |
| enolase 3 | ENSG00000108515 | ENO3 | This gene encodes one of the three enolase isoenzymes found in mammals. This isoenzyme is found in skeletal muscle cells in the adult where it may play a role in muscle development and regeneration. A switch from alpha enolase to beta enolase occurs in muscle tissue during development in rodents. Mutations in this gene have been associated with glycogen storage disease. Alternatively spliced transcript variants encoding different isoforms have been described. | 2027 |
| natriuretic peptide A | ENSG00000175206 | NPPA | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | 4878 |
| solute carrier family 25 member 4 | ENSG00000151729 | SLC25A4 | This gene is a member of the mitochondrial carrier subfamily of solute carrier protein genes. The product of this gene functions as a gated pore that translocates ADP from the cytoplasm into the mitochondrial matrix and ATP from the mitochondrial matrix into the cytoplasm. The protein forms a homodimer embedded in the inner mitochondrial membrane. Mutations in this gene have been shown to result in autosomal dominant progressive external ophthalmoplegia and familial hypertrophic cardiomyopathy. | 291 |
| actin, alpha, cardiac muscle 1 | ENSG00000159251 | ACTC1 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | 70 |
| myomesin 2 | ENSG00000036448 | MYOM2 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD and 165 kD. The predicted MYOM2 protein contains 1,465 amino acids. Like MYOM1, MYOM2 has a unique N-terminal domain followed by 12 repeat domains with strong homology to either fibronectin type III or immunoglobulin C2 domains. Protein sequence comparisons suggested that the MYOM2 protein and bovine M protein are identical. | 9172 |
| uncharacterized LOC100129518 | ENSG00000112096 | LOC100129518 | NA | 100129518 |
| superoxide dismutase 2, mitochondrial | ENSG00000112096 | SOD2 | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | 6648 |
| heat shock protein family B (small) member 7 | ENSG00000173641 | HSPB7 | NA | 27129 |
| LIM domain binding 3 | ENSG00000122367 | LDB3 | This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | 11155 |
| phosphorylase, glycogen, muscle | ENSG00000068976 | PYGM | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | 5837 |
| titin-cap | ENSG00000173991 | TCAP | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | 8557 |
| small proline rich protein 3 | ENSG00000163209 | SPRR3 | NA | 6707 |
| myosin, heavy chain 10, non-muscle | ENSG00000133026 | MYH10 | This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | 4628 |
| myomesin 1 | ENSG00000101605 | MYOM1 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD (myomesin 1) and 165 kD (myomesin 2). This protein, myomesin 1, like myomesin 2, titin, and other myofibrillar proteins contains structural modules with strong homology to either fibronectin type III (motif I) or immunoglobulin C2 (motif II) domains. Myomesin 1 and myomesin 2 each have a unique N-terminal region followed by 12 modules of motif I or motif II, in the arrangement II-II-I-I-I-I-I-II-II-II-II-II. The two proteins share 50% sequence identity in this repeat-containing region. The head structure formed by these 2 proteins on one end of the titin string extends into the center of the M band. The integrating structure of the sarcomere arises from muscle-specific members of the superfamily of immunoglobulin-like proteins. Alternatively spliced transcript variants encoding different isoforms have been identified. | 8736 |
| thyroglobulin | ENSG00000042832 | TG | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | 7038 |
| troponin T2, cardiac type | ENSG00000118194 | TNNT2 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | 7139 |
| versican | ENSG00000038427 | VCAN | This gene is a member of the aggrecan/versican proteoglycan family. The protein encoded is a large chondroitin sulfate proteoglycan and is a major component of the extracellular matrix. This protein is involved in cell adhesion, proliferation, migration and angiogenesis and plays a central role in tissue morphogenesis and maintenance. Mutations in this gene are the cause of Wagner syndrome type 1. Multiple transcript variants encoding different isoforms have been found for this gene. | 1462 |
| cardiomyopathy associated 5 | ENSG00000164309 | CMYA5 | NA | 202333 |
| pyruvate dehydrogenase kinase 4 | ENSG00000004799 | PDK4 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | 5166 |
| actin binding LIM protein 1 | ENSG00000099204 | ABLIM1 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | 3983 |
| myosin light chain 2 | ENSG00000111245 | MYL2 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | 4633 |
| myosin light chain 3 | ENSG00000160808 | MYL3 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. | 4634 |
| fatty acid binding protein 3 | ENSG00000121769 | FABP3 | The intracellular fatty acid-binding proteins (FABPs) belong to a multigene family. FABPs are divided into at least three distinct types, namely the hepatic-, intestinal- and cardiac-type. They form 14-15 kDa proteins and are thought to participate in the uptake, intracellular metabolism and/or transport of long-chain fatty acids. They may also be responsible for the modulation of cell growth and proliferation. Fatty acid-binding protein 3 gene contains four exons and its function is to arrest growth of mammary epithelial cells. This gene is a candidate tumor suppressor gene for human breast cancer. Alternative splicing results in multiple transcript variants. | 2170 |
| NPPA antisense RNA 1 | ENSG00000242349 | NPPA-AS1 | NA | ENSG00000242349 |
| heat shock protein family B (small) member 8 | ENSG00000152137 | HSPB8 | The protein encoded by this gene belongs to the superfamily of small heat-shock proteins containing a conservative alpha-crystallin domain at the C-terminal part of the molecule. The expression of this gene is induced by estrogen in estrogen receptor-positive breast cancer cells, and this protein also functions as a chaperone in association with Bag3, a stimulator of macroautophagy. Thus, this gene appears to be involved in regulation of cell proliferation, apoptosis, and carcinogenesis, and mutations in this gene have been associated with different neuromuscular diseases, including Charcot-Marie-Tooth disease. | 26353 |
| phosphodiesterase 4D interacting protein | ENSG00000178104 | PDE4DIP | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | 9659 |
| TIMP metallopeptidase inhibitor 3 | ENSG00000100234 | TIMP3 | This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix (ECM). Expression of this gene is induced in response to mitogenic stimulation and this netrin domain-containing protein is localized to the ECM. Mutations in this gene have been associated with the autosomal dominant disorder Sorsby’s fundus dystrophy. | 7078 |
| heat shock protein 90kDa alpha family class A member 1 | ENSG00000080824 | HSP90AA1 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | 3320 |
| creatine kinase, mitochondrial 2 | ENSG00000131730 | CKMT2 | Mitochondrial creatine kinase (MtCK) is responsible for the transfer of high energy phosphate from mitochondria to the cytosolic carrier, creatine. It belongs to the creatine kinase isoenzyme family. It exists as two isoenzymes, sarcomeric MtCK and ubiquitous MtCK, encoded by separate genes. Mitochondrial creatine kinase occurs in two different oligomeric forms: dimers and octamers, in contrast to the exclusively dimeric cytosolic creatine kinase isoenzymes. Sarcomeric mitochondrial creatine kinase has 80% homology with the coding exons of ubiquitous mitochondrial creatine kinase. This gene contains sequences homologous to several motifs that are shared among some nuclear genes encoding mitochondrial proteins and thus may be essential for the coordinated activation of these genes during mitochondrial biogenesis. Three transcript variants encoding the same protein have been found for this gene. | 1160 |
| actin, alpha 2, smooth muscle, aorta | ENSG00000107796 | ACTA2 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 59 |
| matrix metallopeptidase 2 | ENSG00000087245 | MMP2 | This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or, intracellularly by its S-glutathiolation with no requirement for proteolytical removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | 4313 |
| keratin 6A | ENSG00000205420 | KRT6A | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3853 |
| colony stimulating factor 3 receptor | ENSG00000119535 | CSF3R | The protein encoded by this gene is the receptor for colony stimulating factor 3, a cytokine that controls the production, differentiation, and function of granulocytes. The encoded protein, which is a member of the family of cytokine receptors, may also function in some cell surface adhesion or recognition processes. Alternatively spliced transcript variants have been described. Mutations in this gene are a cause of Kostmann syndrome, also known as severe congenital neutropenia. | 1441 |
| troponin C1, slow skeletal and cardiac type | ENSG00000114854 | TNNC1 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | 7134 |
| phosphoglucomutase 1 | ENSG00000079739 | PGM1 | The protein encoded by this gene is an isozyme of phosphoglucomutase (PGM) and belongs to the phosphohexose mutase family. There are several PGM isozymes, which are encoded by different genes and catalyze the transfer of phosphate between the 1 and 6 positions of glucose. In most cell types, this PGM isozyme is predominant, representing about 90% of total PGM activity. In red cells, PGM2 is a major isozyme. This gene is highly polymorphic. Mutations in this gene cause glycogen storage disease type 14. Alternatively spliced transcript variants encoding different isoforms have been identified in this gene. | 5236 |
| prostaglandin D2 synthase | ENSG00000107317 | PTGDS | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | 5730 |
| fatty acid synthase | ENSG00000169710 | FASN | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 |
| complement component 1, s subcomponent | ENSG00000182326 | C1S | This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | 716 |
| ankyrin repeat domain 1 | ENSG00000148677 | ANKRD1 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 |
| integrin subunit alpha 8 | ENSG00000077943 | ITGA8 | Integrins are heterodimeric transmembrane receptor proteins that mediate numerous cellular processes including cell adhesion, cytoskeletal rearrangement, and activation of cell signaling pathways. Integrins are composed of alpha and beta subunits. This gene encodes the alpha 8 subunit of the heterodimeric integrin alpha8beta1 protein. The encoded protein is a single-pass type 1 membrane protein that contains multiple FG-GAP repeats. This repeat is predicted to fold into a beta propeller structure. This gene regulates the recruitment of mesenchymal cells into epithelial structures, mediates cell-cell interactions, and regulates neurite outgrowth of sensory and motor neurons. The integrin alpha8beta1 protein thus plays an important role in wound-healing and organogenesis. Mutations in this gene have been associated with renal hypodysplasia/aplasia-1 (RHDA1) and with several animal models of chronic kidney disease. Alternate splicing results in multiple transcript variants encoding distinct isoforms. | 8516 |
| hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), beta subunit | ENSG00000138029 | HADHB | This gene encodes the beta subunit of the mitochondrial trifunctional protein, which catalyzes the last three steps of mitochondrial beta-oxidation of long chain fatty acids. The mitochondrial membrane-bound heterocomplex is composed of four alpha and four beta subunits, with the beta subunit catalyzing the 3-ketoacyl-CoA thiolase activity. The encoded protein can also bind RNA and decreases the stability of some mRNAs. The genes of the alpha and beta subunits of the mitochondrial trifunctional protein are located adjacent to each other in the human genome in a head-to-head orientation. Mutations in this gene result in trifunctional protein deficiency. Alternatively spliced transcript variants encoding different isoforms have been described. | 3032 |
| tumor protein p53 inducible nuclear protein 2 | ENSG00000078804 | TP53INP2 | NA | 58476 |
| TIMP metallopeptidase inhibitor 2 | ENSG00000035862 | TIMP2 | This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | 7077 |
| peroxiredoxin 6 | ENSG00000117592 | PRDX6 | The protein encoded by this gene is a member of the thiol-specific antioxidant protein family. This protein is a bifunctional enzyme with two distinct active sites. It is involved in redox regulation of the cell; it can reduce H(2)O(2) and short chain organic, fatty acid, and phospholipid hydroperoxides. It may play a role in the regulation of phospholipid turnover as well as in protection against oxidative injury. | 9588 |
| NA | ENSG00000229732 | AC019349.5 | NA | ENSG00000229732 |
| latent transforming growth factor beta binding protein 1 | ENSG00000049323 | LTBP1 | The protein encoded by this gene belongs to the family of latent TGF-beta binding proteins (LTBPs). The secretion and activation of TGF-betas is regulated by their association with latency-associated proteins and with latent TGF-beta binding proteins. The product of this gene targets latent complexes of transforming growth factor beta to the extracellular matrix, where the latent cytokine is subsequently activated by several different mechanisms. Alternatively spliced transcript variants encoding different isoforms have been identified. | 4052 |
| hemoglobin subunit alpha 1 | ENSG00000206172 | HBA1 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3039 |
| acetyl-CoA carboxylase beta | ENSG00000076555 | ACACB | Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | 32 |
| solute carrier family 2 member 3 | ENSG00000059804 | SLC2A3 | NA | 6515 |
| myosin binding protein C, cardiac | ENSG00000134571 | MYBPC3 | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | 4607 |
| nebulin | ENSG00000183091 | NEB | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | 4703 |
| histidine rich calcium binding protein | ENSG00000130528 | HRC | This gene encodes a luminal sarcoplasmic reticulum protein identified by its ability to bind low-density lipoprotein with high affinity. The protein interacts with the cytoplasmic domain of triadin, the main transmembrane protein of the junctional sarcoplasmic reticulum (SR) of skeletal muscle. The protein functions in the regulation of releasable calcium into the SR. | 3270 |
| caveolin 1 | ENSG00000105974 | CAV1 | The scaffolding protein encoded by this gene is the main component of the caveolae plasma membranes found in most cell types. The protein links integrin subunits to the tyrosine kinase FYN, an initiating step in coupling integrins to the Ras-ERK pathway and promoting cell cycle progression. The gene is a tumor suppressor gene candidate and a negative regulator of the Ras-p42/44 mitogen-activated kinase cascade. Caveolin 1 and caveolin 2 are located next to each other on chromosome 7 and express colocalizing proteins that form a stable hetero-oligomeric complex. Mutations in this gene have been associated with Berardinelli-Seip congenital lipodystrophy. Alternatively spliced transcripts encode alpha and beta isoforms of caveolin 1. | 857 |
| nebulin related anchoring protein | ENSG00000197893 | NRAP | NA | 4892 |
| collagen type VI alpha 3 chain | ENSG00000163359 | COL6A3 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | 1293 |
| F-box and leucine rich repeat protein 16 | ENSG00000127585 | FBXL16 | Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | 146330 |
| cornulin | ENSG00000143536 | CRNN | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | 49860 |
| heat shock protein family A (Hsp70) member 1B | ENSG00000204388 | HSPA1B | This intronless gene encodes a 70kDa heat shock protein which is a member of the heat shock protein 70 family. In conjunction with other heat shock proteins, this protein stabilizes existing proteins against aggregation and mediates the folding of newly translated proteins in the cytosol and in organelles. It is also involved in the ubiquitin-proteasome pathway through interaction with the AU-rich element RNA-binding protein 1. The gene is located in the major histocompatibility complex class III region, in a cluster with two closely related genes which encode similar proteins. | 3304 |
| heparan sulfate proteoglycan 2 | ENSG00000142798 | HSPG2 | This gene encodes the perlecan protein, which consists of a core protein to which three long chains of glycosaminoglycans (heparan sulfate or chondroitin sulfate) are attached. The perlecan protein is a large multidomain proteoglycan that binds to and cross-links many extracellular matrix components and cell-surface molecules. It has been shown that this protein interacts with laminin, prolargin, collagen type IV, FGFBP1, FBLN2, FGF7 and transthyretin, etc., and it plays essential roles in multiple biological activities. Perlecan is a key component of the vascular extracellular matrix, where it helps to maintain the endothelial barrier function. It is a potent inhibitor of smooth muscle cell proliferation and is thus thought to help maintain vascular homeostasis. It can also promote growth factor (e.g., FGF2) activity and thus stimulate endothelial growth and re-generation. It is a major component of basement membranes, where it is involved in the stabilization of other molecules as well as being involved with glomerular permeability to macromolecules and cell adhesion. Mutations in this gene cause Schwartz-Jampel syndrome type 1, Silverman-Handmaker type of dyssegmental dysplasia, and tardive dyskinesia. Alternative splicing of this gene results in multiple transcript variants. | 3339 |
| microtubule associated monooxygenase, calponin and LIM domain containing 1 | ENSG00000135596 | MICAL1 | This gene encodes an enzyme that oxidizes methionine residues on actin, thereby promoting depolymerization of actin filaments. This protein interacts with and regulates signalling by NEDD9/CAS-L (neural precursor cell expressed, developmentally down-regulated 9). Alternative splicing results in multiple transcript variants. | 64780 |
| vimentin | ENSG00000026025 | VIM | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 |
| myosin, heavy chain 2, skeletal muscle, adult | ENSG00000125414 | MYH2 | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 4620 |
| nicotinamide N-methyltransferase | ENSG00000166741 | NNMT | N-methylation is one method by which drug and other xenobiotic compounds are metabolized by the liver. This gene encodes the protein responsible for this enzymatic activity which uses S-adenosyl methionine as the methyl donor. | 4837 |
| osteoglycin | ENSG00000106809 | OGN | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family of proteins. The encoded protein induces ectopic bone formation in conjunction with transforming growth factor beta and may regulate osteoblast differentiation. High expression of the encoded protein may be associated with elevated heart left ventricular mass. Alternative splicing results in multiple transcript variants. | 4969 |
| retinol saturase | ENSG00000042445 | RETSAT | NA | 54884 |
| semaphorin 3B | ENSG00000012171 | SEMA3B | The protein encoded by this gene belongs to the class-3 semaphorin/collapsin family, whose members function in growth cone guidance during neuronal development. This family member inhibits axonal extension and has been shown to act as a tumor suppressor by inducing apoptosis. Alternative splicing of this gene results in multiple transcript variants. | 7869 |
| protein phosphatase 1 regulatory inhibitor subunit 14A | ENSG00000167641 | PPP1R14A | The protein encoded by this gene belongs to the protein phosphatase 1 (PP1) inhibitor family. This protein is an inhibitor of smooth muscle myosin phosphatase, and has higher inhibitory activity when phosphorylated. Inhibition of myosin phosphatase leads to increased myosin phosphorylation and enhanced smooth muscle contraction. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | 94274 |
| latent transforming growth factor beta binding protein 2 | ENSG00000119681 | LTBP2 | The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | 4053 |
| Ran GTPase activating protein 1 | ENSG00000100401 | RANGAP1 | This gene encodes a protein that associates with the nuclear pore complex and participates in the regulation of nuclear transport. The encoded protein interacts with Ras-related nuclear protein 1 (RAN) and regulates guanosine triphosphate (GTP)-binding and exchange. Alternative splicing results in multiple transcript variants. | 5905 |
| cystatin B | ENSG00000160213 | CSTB | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins l, h and b. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). | 1476 |
| DDB1 and CUL4 associated factor 6 | ENSG00000143164 | DCAF6 | NA | 55827 |
| nicotinamide nucleotide transhydrogenase | ENSG00000112992 | NNT | This gene encodes an integral protein of the inner mitochondrial membrane. The enzyme couples hydride transfer between NAD(H) and NADP(+) to proton translocation across the inner mitochondrial membrane. Under most physiological conditions, the enzyme uses energy from the mitochondrial proton gradient to produce high concentrations of NADPH. The resulting NADPH is used for biosynthesis and in free radical detoxification. Two alternatively spliced variants, encoding the same protein, have been found for this gene. | 23530 |
| integrin subunit alpha X | ENSG00000140678 | ITGAX | This gene encodes the integrin alpha X chain protein. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain. This protein combines with the beta 2 chain (ITGB2) to form a leukocyte-specific integrin referred to as inactivated-C3b (iC3b) receptor 4 (CR4). The alpha X beta 2 complex seems to overlap the properties of the alpha M beta 2 integrin in the adherence of neutrophils and monocytes to stimulated endothelium cells, and in the phagocytosis of complement coated particles. Two transcript variants encoding different isoforms have been found for this gene. | 3687 |
| complement component 3 | ENSG00000125730 | C3 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | 718 |
| extracellular matrix protein 1 | ENSG00000143369 | ECM1 | This gene encodes a soluble protein that is involved in endochondral bone formation, angiogenesis, and tumor biology. It also interacts with a variety of extracellular and structural proteins, contributing to the maintenance of skin integrity and homeostasis. Mutations in this gene are associated with lipoid proteinosis disorder (also known as hyalinosis cutis et mucosae or Urbach-Wiethe disease) that is characterized by generalized thickening of skin, mucosae and certain viscera. Alternatively spliced transcript variants encoding distinct isoforms have been described for this gene. | 1893 |
| uncharacterized LOC105372824 | ENSG00000160209 | LOC105372824 | NA | 105372824 |
| pyridoxal (pyridoxine, vitamin B6) kinase | ENSG00000160209 | PDXK | The protein encoded by this gene phosphorylates vitamin B6, a step required for the conversion of vitamin B6 to pyridoxal-5-phosphate, an important cofactor in intermediary metabolism. The encoded protein is cytoplasmic and probably acts as a homodimer. Alternatively spliced transcript variants have been described, but their biological validity has not been determined. | 8566 |
| malate dehydrogenase 1 | ENSG00000014641 | MDH1 | This gene encodes an enzyme that catalyzes the NAD/NADH-dependent, reversible oxidation of malate to oxaloacetate in many metabolic pathways, including the citric acid cycle. Two main isozymes are known to exist in eukaryotic cells: one is found in the mitochondrial matrix and the other in the cytoplasm. This gene encodes the cytosolic isozyme, which plays a key role in the malate-aspartate shuttle that allows malate to pass through the mitochondrial membrane to be transformed into oxaloacetate for further cellular processes. Alternatively spliced transcript variants have been found for this gene. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is localized in the peroxisomes. Pseudogenes have been identified on chromosomes X and 6. | 4190 |
| myosin, heavy chain 9, non-muscle | ENSG00000100345 | MYH9 | This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | 4627 |
| zinc finger protein 106 | ENSG00000103994 | ZNF106 | NA | 64397 |
| heat shock protein family A (Hsp70) member 6 | ENSG00000173110 | HSPA6 | NA | 3310 |
| acyl-CoA synthetase long-chain family member 1 | ENSG00000151726 | ACSL1 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | 2180 |
| phosphatidylinositol-3,4,5-trisphosphate dependent Rac exchange factor 1 | ENSG00000124126 | PREX1 | The protein encoded by this gene acts as a guanine nucleotide exchange factor for the RHO family of small GTP-binding proteins (RACs). It has been shown to bind to and activate RAC1 by exchanging bound GDP for free GTP. The encoded protein, which is found mainly in the cytoplasm, is activated by phosphatidylinositol-3,4,5-trisphosphate and the beta-gamma subunits of heterotrimeric G proteins. | 57580 |
# Save the queried Ensembl gene IDs to the cluster-13 file, one ID per line.
write.table(out$query,
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 13, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
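Two details of the tables in this report may matter downstream. A query ID can come back more than once (ENSG00000160209 is returned for both LOC105372824 and PDXK), and the column order and the presence of a `notfound` column vary from one query to the next. The sketch below shows one way to handle both, assuming that one ID per gene and a fixed column layout are what is wanted. The helper names `write_cluster_ids` and `standardize_columns` are hypothetical and not part of the original analysis, and the toy data frame only stands in for `as.data.frame(out)`.

```r
# Hypothetical helper (an assumption, not the original code): write the
# unique query IDs of cluster k to "<out_dir>/gene_names_clus_<k>.txt".
write_cluster_ids <- function(query_ids, k, out_dir) {
  ids <- unique(as.character(query_ids))  # drop repeated query IDs
  ids <- ids[!is.na(ids)]                 # drop missing IDs
  out_file <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  writeLines(ids, out_file)               # one ID per line
  invisible(out_file)
}

# Hypothetical helper: return the annotation columns in a fixed order,
# adding any column a given result lacks (e.g. "notfound") as NA.
standardize_columns <- function(df,
                                cols = c("query", "symbol", "name",
                                         "summary", "X_id", "notfound")) {
  df <- as.data.frame(df, stringsAsFactors = FALSE)
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- rep(NA, nrow(df))
  }
  df[, cols, drop = FALSE]
}

# Toy stand-in for a query result in which one ID is returned twice.
toy <- data.frame(
  query  = c("ENSG00000160209", "ENSG00000160209", "ENSG00000014641"),
  symbol = c("LOC105372824", "PDXK", "MDH1"),
  stringsAsFactors = FALSE
)

out_file <- write_cluster_ids(toy$query, 13, tempdir())
written  <- readLines(out_file)
tidy     <- standardize_columns(toy)
```

In the report itself, the same helpers could be called as `write_cluster_ids(out$query, 13, "../utilities/GTEX2013_sparse_load_sqrt")` and `kable(standardize_columns(as.data.frame(out)))`, so that every cluster table shares one layout.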
# Annotate the genes in row 14 of gene_list (Ensembl gene IDs) via mygene.
out <- mygene::queryMany(gene_list[14,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | summary | query | name | notfound |
|---|---|---|---|---|---|
| NPPA | 4878 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | ENSG00000175206 | natriuretic peptide A | NA |
| MYH6 | 4624 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | ENSG00000197616 | myosin, heavy chain 6, cardiac muscle, alpha | NA |
| ACTC1 | 70 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | ENSG00000159251 | actin, alpha, cardiac muscle 1 | NA |
| NPPA-AS1 | ENSG00000242349 | NA | ENSG00000242349 | NPPA antisense RNA 1 | NA |
| ANKRD1 | 27063 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | ENSG00000148677 | ankyrin repeat domain 1 | NA |
| FN1 | 2335 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | ENSG00000115414 | fibronectin 1 | NA |
| PAM | 5066 | This gene encodes a multifunctional protein. The encoded preproprotein is proteolytically processed to generate the mature enzyme. This enzyme includes two domains with distinct catalytic activities, a peptidylglycine alpha-hydroxylating monooxygenase (PHM) domain and a peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) domain. These catalytic domains work sequentially to catalyze the conversion of neuroendocrine peptides to active alpha-amidated products. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000145730 | peptidylglycine alpha-amidating monooxygenase | NA |
| MB | 4151 | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | ENSG00000198125 | myoglobin | NA |
| COL1A1 | 1277 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000108821 | collagen type I alpha 1 | NA |
| MYH7 | 4625 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | ENSG00000092054 | myosin, heavy chain 7, cardiac muscle, beta | NA |
| DKK3 | 27122 | This gene encodes a protein that is a member of the dickkopf family. The secreted protein contains two cysteine rich regions and is involved in embryonic development through its interactions with the Wnt signaling pathway. The expression of this gene is decreased in a variety of cancer cell lines and it may function as a tumor suppressor gene. Alternative splicing results in multiple transcript variants encoding the same protein. | ENSG00000050165 | dickkopf WNT signaling pathway inhibitor 3 | NA |
| MYL7 | 58498 | NA | ENSG00000106631 | myosin light chain 7 | NA |
| SYNPO | 11346 | Synaptopodin is an actin-associated protein that may play a role in actin-based cell shape and motility. The name synaptopodin derives from the protein’s associations with postsynaptic densities and dendritic spines and with renal podocytes (Mundel et al., 1997 [PubMed 9314539]). | ENSG00000171992 | synaptopodin | NA |
| ATP2A2 | 488 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000174437 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 | NA |
| HSPB7 | 27129 | NA | ENSG00000173641 | heat shock protein family B (small) member 7 | NA |
| FTL | 2512 | This gene encodes the light subunit of the ferritin protein. Ferritin is the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in this light chain ferritin gene are associated with several neurodegenerative diseases and hyperferritinemia-cataract syndrome. This gene has multiple pseudogenes. | ENSG00000087086 | ferritin, light polypeptide | NA |
| FABP3 | 2170 | The intracellular fatty acid-binding proteins (FABPs) belong to a multigene family. FABPs are divided into at least three distinct types, namely the hepatic-, intestinal- and cardiac-type. They form 14-15 kDa proteins and are thought to participate in the uptake, intracellular metabolism and/or transport of long-chain fatty acids. They may also be responsible for the modulation of cell growth and proliferation. Fatty acid-binding protein 3 gene contains four exons and its function is to arrest growth of mammary epithelial cells. This gene is a candidate tumor suppressor gene for human breast cancer. Alternative splicing results in multiple transcript variants. | ENSG00000121769 | fatty acid binding protein 3 | NA |
| ACTN2 | 88 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000077522 | actinin alpha 2 | NA |
| COL6A3 | 1293 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | ENSG00000163359 | collagen type VI alpha 3 chain | NA |
| DSTN | 11034 | The product of this gene belongs to the actin-binding proteins ADF family. This family of proteins is responsible for enhancing the turnover rate of actin in vivo. This gene encodes the actin depolymerizing protein that severs actin filaments (F-actin) and binds to actin monomers (G-actin). Two transcript variants encoding distinct isoforms have been identified for this gene. | ENSG00000125868 | destrin, actin depolymerizing factor | NA |
| TCAP | 8557 | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | ENSG00000173991 | titin-cap | NA |
| SERPINE1 | 5054 | This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000106366 | serpin family E member 1 | NA |
| MYBPC3 | 4607 | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | ENSG00000134571 | myosin binding protein C, cardiac | NA |
| TTN | 7273 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | ENSG00000155657 | titin | NA |
| COL6A2 | 1292 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | ENSG00000142173 | collagen type VI alpha 2 | NA |
| MYL12A | 10627 | This gene encodes a nonsarcomeric myosin regulatory light chain. This protein is activated by phosphorylation and regulates smooth muscle and non-muscle cell contraction. This protein may also be involved in DNA damage repair by sequestering the transcriptional regulator apoptosis-antagonizing transcription factor (AATF)/Che-1 which functions as a repressor of p53-driven apoptosis. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 8. | ENSG00000101608 | myosin light chain 12A | NA |
| PTX3 | 5806 | NA | ENSG00000163661 | pentraxin 3 | NA |
| ACTA2 | 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000107796 | actin, alpha 2, smooth muscle, aorta | NA |
| CRIP2 | 1397 | This gene encodes a putative transcription factor with two LIM zinc-binding domains. The encoded protein may participate in the differentiation of smooth muscle tissue. Alternative splicing results in multiple transcript variants. | ENSG00000182809 | cysteine rich protein 2 | NA |
| FSTL1 | 11167 | This gene encodes a protein with similarity to follistatin, an activin-binding protein. It contains an FS module, a follistatin-like sequence containing 10 conserved cysteine residues. This gene product is thought to be an autoantigen associated with rheumatoid arthritis. | ENSG00000163430 | follistatin like 1 | NA |
| TNNC1 | 7134 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | ENSG00000114854 | troponin C1, slow skeletal and cardiac type | NA |
| MYH10 | 4628 | This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000133026 | myosin, heavy chain 10, non-muscle | NA |
| EFEMP1 | 2202 | This gene encodes a member of the fibulin family of extracellular matrix glycoproteins. Like all members of this family, the encoded protein contains tandemly repeated epidermal growth factor-like repeats followed by a C-terminus fibulin-type domain. This gene is upregulated in malignant gliomas and may play a role in the aggressive nature of these tumors. Mutations in this gene are associated with Doyne honeycomb retinal dystrophy. Alternatively spliced transcript variants that encode the same protein have been described. | ENSG00000115380 | EGF containing fibulin like extracellular matrix protein 1 | NA |
| THBS1 | 7057 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | ENSG00000137801 | thrombospondin 1 | NA |
| FBN1 | 2200 | This gene encodes a member of the fibrillin family of proteins. The encoded preproprotein is proteolytically processed to generate two proteins including the extracellular matrix component fibrillin-1 and the protein hormone asprosin. Fibrillin-1 is an extracellular matrix glycoprotein that serves as a structural component of calcium-binding microfibrils. These microfibrils provide force-bearing structural support in elastic and nonelastic connective tissue throughout the body. Asprosin, secreted by white adipose tissue, has been shown to regulate glucose homeostasis. Mutations in this gene are associated with Marfan syndrome and the related MASS phenotype, as well as ectopia lentis syndrome, Weill-Marchesani syndrome, Shprintzen-Goldberg syndrome and neonatal progeroid syndrome. | ENSG00000166147 | fibrillin 1 | NA |
| FBLN5 | 10516 | The protein encoded by this gene is a secreted, extracellular matrix protein containing an Arg-Gly-Asp (RGD) motif and calcium-binding EGF-like domains. It promotes adhesion of endothelial cells through interaction of integrins and the RGD motif. It is prominently expressed in developing arteries but less so in adult vessels. However, its expression is reinduced in balloon-injured vessels and atherosclerotic lesions, notably in intimal vascular smooth muscle cells and endothelial cells. Therefore, the protein encoded by this gene may play a role in vascular development and remodeling. Defects in this gene are a cause of autosomal dominant cutis laxa, autosomal recessive cutis laxa type I (CL type I), and age-related macular degeneration type 3 (ARMD3). | ENSG00000140092 | fibulin 5 | NA |
| MT2A | 4502 | NA | ENSG00000125148 | metallothionein 2A | NA |
| MYL9 | 10398 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000101335 | myosin light chain 9 | NA |
| CCDC80 | 151887 | NA | ENSG00000091986 | coiled-coil domain containing 80 | NA |
| ALDOA | 226 | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | ENSG00000149925 | aldolase, fructose-bisphosphate A | NA |
| CTSB | 1508 | This gene encodes a member of the C1 family of peptidases. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to form the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. It is also known as amyloid precursor protein secretase and is involved in the proteolytic processing of amyloid precursor protein (APP). Incomplete proteolytic processing of APP has been suggested to be a causative factor in Alzheimer’s disease, the most common cause of dementia. Overexpression of the encoded protein has been associated with esophageal adenocarcinoma and other tumors. Multiple pseudogenes of this gene have been identified. | ENSG00000164733 | cathepsin B | NA |
| NA | NA | NA | ENSG00000259716 | NA | TRUE |
| COL27A1 | 85301 | This gene encodes a member of the fibrillar collagen family, and plays a role during the calcification of cartilage and the transition of cartilage to bone. The encoded protein product is a preproprotein. It includes an N-terminal signal peptide, which is followed by an N-terminal propeptide, mature peptide and a C-terminal propeptide. The N-terminal propeptide contains thrombospondin N-terminal-like and laminin G-like domains. The mature peptide is a major triple-helical region. The C-terminal propeptide, also known as COLFI domain, plays crucial roles in tissue growth and repair. Mutations in this gene cause Steel syndrome. Alternatively spliced transcript variants have been found, but the full-length nature of some variants has not been determined. | ENSG00000196739 | collagen type XXVII alpha 1 | NA |
| FNBP1 | 23048 | The protein encoded by this gene is a member of the formin-binding-protein family. The protein contains an N-terminal Fer/Cdc42-interacting protein 4 (CIP4) homology (FCH) domain followed by a coiled-coil domain, a proline-rich motif, a second coiled-coil domain, a Rho family protein-binding domain (RBD), and a C-terminal SH3 domain. This protein binds sorting nexin 2 (SNX2), tankyrase (TNKS), and dynamin; an interaction between this protein and formin has not been demonstrated yet in human. | ENSG00000187239 | formin binding protein 1 | NA |
| TPM1 | 7168 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | ENSG00000140416 | tropomyosin 1 (alpha) | NA |
| TGFBI | 7045 | This gene encodes an RGD-containing protein that binds to type I, II and IV collagens. The RGD motif is found in many extracellular matrix proteins modulating cell adhesion and serves as a ligand recognition sequence for several integrins. This protein plays a role in cell-collagen interactions and may be involved in endochondral bone formation in cartilage. The protein is induced by transforming growth factor-beta and acts to inhibit cell adhesion. Mutations in this gene are associated with multiple types of corneal dystrophy. | ENSG00000120708 | transforming growth factor beta induced | NA |
| MYOM2 | 9172 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD and 165 kD. The predicted MYOM2 protein contains 1,465 amino acids. Like MYOM1, MYOM2 has a unique N-terminal domain followed by 12 repeat domains with strong homology to either fibronectin type III or immunoglobulin C2 domains. Protein sequence comparisons suggested that the MYOM2 protein and bovine M protein are identical. | ENSG00000036448 | myomesin 2 | NA |
| GLUL | 2752 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | ENSG00000135821 | glutamate-ammonia ligase | NA |
| FABP4 | 2167 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs' roles include fatty acid uptake, transport, and metabolism. | ENSG00000170323 | fatty acid binding protein 4 | NA |
| FLNB | 2317 | This gene encodes a member of the filamin family. The encoded protein interacts with glycoprotein Ib alpha as part of the process to repair vascular injuries. The platelet glycoprotein Ib complex includes glycoprotein Ib alpha, and it binds the actin cytoskeleton. Mutations in this gene have been found in several conditions: atelosteogenesis type 1 and type 3; boomerang dysplasia; autosomal dominant Larsen syndrome; and spondylocarpotarsal synostosis syndrome. Multiple alternatively spliced transcript variants that encode different protein isoforms have been described for this gene. | ENSG00000136068 | filamin B | NA |
| NPPB | 4879 | This gene is a member of the natriuretic peptide family and encodes a secreted protein which functions as a cardiac hormone. The protein undergoes two cleavage events, one within the cell and a second after secretion into the blood. The protein’s biological actions include natriuresis, diuresis, vasorelaxation, inhibition of renin and aldosterone secretion, and a key role in cardiovascular homeostasis. A high concentration of this protein in the bloodstream is indicative of heart failure. The protein also acts as an antimicrobial peptide with antibacterial and antifungal activity. Mutations in this gene have been associated with postmenopausal osteoporosis. | ENSG00000120937 | natriuretic peptide B | NA |
| MTND2P28 | ENSG00000225630 | NA | ENSG00000225630 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | NA |
| ACTA1 | 58 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | ENSG00000143632 | actin, alpha 1, skeletal muscle | NA |
| ACTG2 | 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | ENSG00000163017 | actin, gamma 2, smooth muscle, enteric | NA |
| GOLGA8A | 23015 | The Golgi apparatus, which participates in glycosylation and transport of proteins and lipids in the secretory pathway, consists of a series of stacked, flattened membrane sacs referred to as cisternae. Interactions between the Golgi and microtubules are thought to be important for the reorganization of the Golgi after it fragments during mitosis. The golgins constitute a family of proteins which are localized to the Golgi. This gene encodes a golgin which structurally resembles its family member GOLGA2, suggesting that they may share a similar function. There are many similar copies of this gene on chromosome 15. Alternative splicing results in multiple transcript variants. | ENSG00000175265 | golgin A8 family member A | NA |
| DCN | 1634 | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | ENSG00000011465 | decorin | NA |
| TGM2 | 7052 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000198959 | transglutaminase 2 | NA |
| NNMT | 4837 | N-methylation is one method by which drug and other xenobiotic compounds are metabolized by the liver. This gene encodes the protein responsible for this enzymatic activity which uses S-adenosyl methionine as the methyl donor. | ENSG00000166741 | nicotinamide N-methyltransferase | NA |
| CASQ2 | 845 | The protein encoded by this gene specifies the cardiac muscle family member of the calsequestrin family. Calsequestrin is localized to the sarcoplasmic reticulum in cardiac and slow skeletal muscle cells. The protein is a calcium binding protein that stores calcium for muscle function. Mutations in this gene cause stress-induced polymorphic ventricular tachycardia, also referred to as catecholaminergic polymorphic ventricular tachycardia 2 (CPVT2), a disease characterized by bidirectional ventricular tachycardia that may lead to cardiac arrest. | ENSG00000118729 | calsequestrin 2 | NA |
| MGP | 4256 | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000111341 | matrix Gla protein | NA |
| AEBP1 | 165 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | ENSG00000106624 | AE binding protein 1 | NA |
| MFGE8 | 4240 | This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | ENSG00000140545 | milk fat globule-EGF factor 8 protein | NA |
| KRT4 | 3851 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000170477 | keratin 4 | NA |
| LTBP1 | 4052 | The protein encoded by this gene belongs to the family of latent TGF-beta binding proteins (LTBPs). The secretion and activation of TGF-betas is regulated by their association with latency-associated proteins and with latent TGF-beta binding proteins. The product of this gene targets latent complexes of transforming growth factor beta to the extracellular matrix, where the latent cytokine is subsequently activated by several different mechanisms. Alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000049323 | latent transforming growth factor beta binding protein 1 | NA |
| NDUFA4 | 4697 | The protein encoded by this gene belongs to the complex I 9kDa subunit family. Mammalian complex I of mitochondrial respiratory chain is composed of 45 different subunits. This protein has NADH dehydrogenase activity and oxidoreductase activity. It transfers electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone. | ENSG00000189043 | NDUFA4, mitochondrial complex associated | NA |
| RGS5 | 8490 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | ENSG00000143248 | regulator of G-protein signaling 5 | NA |
| MYL3 | 4634 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. | ENSG00000160808 | myosin light chain 3 | NA |
| PDE4DIP | 9659 | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000178104 | phosphodiesterase 4D interacting protein | NA |
| LRP1 | 4035 | This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | ENSG00000123384 | LDL receptor related protein 1 | NA |
| CALD1 | 800 | This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | ENSG00000122786 | caldesmon 1 | NA |
| LDB3 | 11155 | This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | ENSG00000122367 | LIM domain binding 3 | NA |
| PPP1R3C | 5507 | This gene encodes a regulatory subunit of protein phosphatase-1 (PP1). PP1 catalyzes reversible protein phosphorylation, which is important in a wide range of cellular activities: neuronal, muscular, RNA splicing, protein synthesis, cell death, and glycogen metabolism, to name just a few. By interacting with different regulatory subunits, PP1 is directed to different parts of the cell, to different substrates, or to respond to extracellular signals. | ENSG00000119938 | protein phosphatase 1 regulatory subunit 3C | NA |
| TNNT2 | 7139 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | ENSG00000118194 | troponin T2, cardiac type | NA |
| MYL4 | 4635 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two myosin heavy chains, two nonphosphorylatable myosin alkali light chains, and two phosphorylatable myosin regulatory light chains. This gene encodes a myosin alkali light chain that is found in embryonic muscle and adult atria. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | ENSG00000198336 | myosin light chain 4 | NA |
| VCAN | 1462 | This gene is a member of the aggrecan/versican proteoglycan family. The protein encoded is a large chondroitin sulfate proteoglycan and is a major component of the extracellular matrix. This protein is involved in cell adhesion, proliferation, migration and angiogenesis and plays a central role in tissue morphogenesis and maintenance. Mutations in this gene are the cause of Wagner syndrome type 1. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000038427 | versican | NA |
| MDH1 | 4190 | This gene encodes an enzyme that catalyzes the NAD/NADH-dependent, reversible oxidation of malate to oxaloacetate in many metabolic pathways, including the citric acid cycle. Two main isozymes are known to exist in eukaryotic cells: one is found in the mitochondrial matrix and the other in the cytoplasm. This gene encodes the cytosolic isozyme, which plays a key role in the malate-aspartate shuttle that allows malate to pass through the mitochondrial membrane to be transformed into oxaloacetate for further cellular processes. Alternatively spliced transcript variants have been found for this gene. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is localized in the peroxisomes. Pseudogenes have been identified on chromosomes X and 6. | ENSG00000014641 | malate dehydrogenase 1 | NA |
| ACTA2-AS1 | ENSG00000180139 | NA | ENSG00000180139 | ACTA2 antisense RNA 1 | NA |
| HSPA2 | 3306 | NA | ENSG00000126803 | heat shock protein family A (Hsp70) member 2 | NA |
| IGFBP3 | 3486 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein forms a ternary complex with insulin-like growth factor acid-labile subunit (IGFALS) and either insulin-like growth factor (IGF) I or II. In this form, it circulates in the plasma, prolonging the half-life of IGFs and altering their interaction with cell surface receptors. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | ENSG00000146674 | insulin like growth factor binding protein 3 | NA |
| DKK1 | 22943 | This gene encodes a protein that is a member of the dickkopf family. It is a secreted protein with two cysteine rich regions and is involved in embryonic development through its inhibition of the WNT signaling pathway. Elevated levels of DKK1 in bone marrow plasma and peripheral blood are associated with the presence of osteolytic bone lesions in patients with multiple myeloma. | ENSG00000107984 | dickkopf WNT signaling pathway inhibitor 1 | NA |
| RPL3 | 6122 | Ribosomes, the complexes that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L3P family of ribosomal proteins and it is located in the cytoplasm. The protein can bind to the HIV-1 TAR mRNA, and it has been suggested that the protein contributes to tat-mediated transactivation. This gene is co-transcribed with several small nucleolar RNA genes, which are located in several of this gene’s introns. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ENSG00000100316 | ribosomal protein L3 | NA |
| MRC2 | 9902 | This gene encodes a member of the mannose receptor family of proteins that contain a fibronectin type II domain and multiple C-type lectin-like domains. The encoded protein plays a role in extracellular matrix remodeling by mediating the internalization and lysosomal degradation of collagen ligands. Expression of this gene may play a role in the tumorigenesis and metastasis of several malignancies including breast cancer, gliomas and metastatic bone disease. | ENSG00000011028 | mannose receptor C type 2 | NA |
| FBLN1 | 2192 | Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | ENSG00000077942 | fibulin 1 | NA |
| LDLRAP1 | 26119 | The protein encoded by this gene is a cytosolic protein which contains a phosphotyrosine binding (PTB) domain. The PTB domain has been found to interact with the cytoplasmic tail of the LDL receptor. Mutations in this gene lead to LDL receptor malfunction and cause the disorder autosomal recessive hypercholesterolaemia. | ENSG00000157978 | low density lipoprotein receptor adaptor protein 1 | NA |
| RPS6 | 6194 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a cytoplasmic ribosomal protein that is a component of the 40S subunit. The protein belongs to the S6E family of ribosomal proteins. It is the major substrate of protein kinases in the ribosome, with subsets of five C-terminal serine residues phosphorylated by different protein kinases. Phosphorylation is induced by a wide range of stimuli, including growth factors, tumor-promoting agents, and mitogens. Dephosphorylation occurs at growth arrest. The protein may contribute to the control of cell growth and proliferation through the selective translation of particular classes of mRNA. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ENSG00000137154 | ribosomal protein S6 | NA |
| GPD1 | 2819 | This gene encodes a member of the NAD-dependent glycerol-3-phosphate dehydrogenase family. The encoded protein plays a critical role in carbohydrate and lipid metabolism by catalyzing the reversible conversion of dihydroxyacetone phosphate (DHAP) and reduced nicotinamide adenine dinucleotide (NADH) to glycerol-3-phosphate (G3P) and NAD+. The encoded cytosolic protein and mitochondrial glycerol-3-phosphate dehydrogenase also form a glycerol phosphate shuttle that facilitates the transfer of reducing equivalents from the cytosol to mitochondria. Mutations in this gene are a cause of transient infantile hypertriglyceridemia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000167588 | glycerol-3-phosphate dehydrogenase 1 | NA |
| NOV | 4856 | The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | ENSG00000136999 | nephroblastoma overexpressed | NA |
| PTGDS | 5730 | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | ENSG00000107317 | prostaglandin D2 synthase | NA |
| COL5A2 | 1290 | This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. Mutations in this gene are associated with Ehlers-Danlos syndrome, types I and II. | ENSG00000204262 | collagen type V alpha 2 chain | NA |
| CTGF | 1490 | The protein encoded by this gene is a mitogen that is secreted by vascular endothelial cells. The encoded protein plays a role in chondrocyte proliferation and differentiation, cell adhesion in many cell types, and is related to platelet-derived growth factor. Certain polymorphisms in this gene have been linked with a higher incidence of systemic sclerosis. | ENSG00000118523 | connective tissue growth factor | NA |
| NA | NA | NA | ENSG00000163486 | NA | TRUE |
| GPNMB | 10457 | The protein encoded by this gene is a type I transmembrane glycoprotein which shows homology to the pMEL17 precursor, a melanocyte-specific protein. GPNMB shows expression in the lowly metastatic human melanoma cell lines and xenografts but does not show expression in the highly metastatic cell lines. GPNMB may be involved in growth delay and reduction of metastatic potential. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000136235 | glycoprotein nmb | NA |
| TNC | 3371 | This gene encodes an extracellular matrix protein with a spatially and temporally restricted tissue distribution. This protein is homohexameric with disulfide-linked subunits, and contains multiple EGF-like and fibronectin type-III domains. It is implicated in guidance of migrating neurons as well as axons during development, synaptic plasticity, and neuronal regeneration. | ENSG00000041982 | tenascin C | NA |
| PNPLA2 | 57104 | This gene encodes an enzyme which catalyzes the first step in the hydrolysis of triglycerides in adipose tissue. Mutations in this gene are associated with neutral lipid storage disease with myopathy. | ENSG00000177666 | patatin like phospholipase domain containing 2 | NA |
| PLIN1 | 5346 | The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | ENSG00000166819 | perilipin 1 | NA |
| SLC25A3 | 5250 | The protein encoded by this gene catalyzes the transport of phosphate into the mitochondrial matrix, either by proton cotransport or in exchange for hydroxyl ions. The protein contains three related segments arranged in tandem which are related to those found in other characterized members of the mitochondrial carrier family. Both the N-terminal and C-terminal regions of this protein protrude toward the cytosol. Multiple alternatively spliced transcript variants have been isolated. | ENSG00000075415 | solute carrier family 25 member 3 | NA |
| CYB5R3 | 1727 | This gene encodes cytochrome b5 reductase, which includes a membrane-bound form in somatic cells (anchored in the endoplasmic reticulum, mitochondrial and other membranes) and a soluble form in erythrocytes. The membrane-bound form exists mainly on the cytoplasmic side of the endoplasmic reticulum and functions in desaturation and elongation of fatty acids, in cholesterol biosynthesis, and in drug metabolism. The erythrocyte form is located in a soluble fraction of circulating erythrocytes and is involved in methemoglobin reduction. The membrane-bound form has both membrane-binding and catalytic domains, while the soluble form has only the catalytic domain. Alternate splicing results in multiple transcript variants. Mutations in this gene cause methemoglobinemias. | ENSG00000100243 | cytochrome b5 reductase 3 | NA |
| CDH2 | 1000 | This gene encodes a classical cadherin and member of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate a calcium-dependent cell adhesion molecule and glycoprotein. This protein plays a role in the establishment of left-right asymmetry, development of the nervous system and the formation of cartilage and bone. | ENSG00000170558 | cadherin 2 | NA |
| PALLD | 23022 | This gene encodes a cytoskeletal protein that is required for organizing the actin cytoskeleton. The protein is a component of actin-containing microfilaments, and it is involved in the control of cell shape, adhesion, and contraction. Polymorphisms in this gene are associated with a susceptibility to pancreatic cancer type 1, and also with a risk for myocardial infarction. Alternative splicing results in multiple transcript variants. | ENSG00000129116 | palladin, cytoskeletal associated protein | NA |
| TIAM1 | 7074 | NA | ENSG00000156299 | T-cell lymphoma invasion and metastasis 1 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 14, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
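The query, table, and export steps are repeated for every cluster, and the column order of the `queryMany` result is not the same from one call to the next (compare where `name` and `summary` sit in the tables on either side). Some queries also come back unresolved, such as ENSG00000163486 above, which is flagged `notfound = TRUE`. The sketch below shows one possible way to normalise the result and write the ID file. Both `tidy_annotations` and `write_cluster_genes` are hypothetical helpers, not part of the original analysis, and the toy data frame stands in for a real `mygene::queryMany` result so the example runs without network access.

```r
# Hypothetical helper (not part of the original analysis): put the annotation
# columns in a fixed order and split off IDs that the query did not resolve.
tidy_annotations <- function(out, cols = c("query", "symbol", "name", "summary")) {
  df <- as.data.frame(out)
  # Add any expected column that this particular result lacks.
  for (col in setdiff(c(cols, "notfound"), names(df))) {
    df[[col]] <- rep(NA, nrow(df))
  }
  unresolved <- !is.na(df$notfound) & as.logical(df$notfound)
  list(found = df[!unresolved, cols, drop = FALSE],
       not_found = as.character(df$query[unresolved]))
}

# Hypothetical helper: write one Ensembl ID per line for a given cluster.
write_cluster_genes <- function(ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  writeLines(as.character(ids), path)
  invisible(path)
}

# Toy input mimicking two rows of the table above (summaries shortened).
toy <- data.frame(
  symbol = c("KRT4", NA), X_id = c("3851", NA),
  summary = c("keratin family member", NA),
  query = c("ENSG00000170477", "ENSG00000163486"),
  name = c("keratin 4", NA), notfound = c(NA, TRUE),
  stringsAsFactors = FALSE
)
res <- tidy_annotations(toy)
path <- write_cluster_genes(toy$query, 14, tempdir())
```

In the real workflow, `kable(tidy_annotations(out)$found)` would give every cluster table the same column layout, and `out_dir` would point to `../utilities/GTEX2013_sparse_load_sqrt`. Writing with `writeLines` is assumed to be equivalent to the `write.table` call here, since both emit one unquoted ID per line.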
out <- mygene::queryMany(gene_list[15,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | name | query | summary | notfound |
|---|---|---|---|---|---|
| PRSS1 | 5644 | protease, serine 1 | ENSG00000204983 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | NA |
| CPA1 | 1357 | carboxypeptidase A1 | ENSG00000091704 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | NA |
| PNLIP | 5406 | pancreatic lipase | ENSG00000175535 | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | NA |
| CELA3A | 10136 | chymotrypsin like elastase family member 3A | ENSG00000142789 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | NA |
| GP2 | 2813 | glycoprotein 2 | ENSG00000169347 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | NA |
| MBP | 4155 | myelin basic protein | ENSG00000197971 | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | NA |
| CPB1 | 1360 | carboxypeptidase B1 | ENSG00000153002 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | NA |
| CLPS | 1208 | colipase | ENSG00000137392 | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| CELA3B | 23436 | chymotrypsin like elastase family member 3B | ENSG00000219073 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | NA |
| CTRB2 | 440387 | chymotrypsinogen B2 | ENSG00000168928 | NA | NA |
| CTRB1 | 1504 | chymotrypsinogen B1 | ENSG00000168925 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | NA |
| CEL | 1056 | carboxyl ester lipase | ENSG00000170835 | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | NA |
| AMY2B | 280 | amylase, alpha 2B (pancreatic) | ENSG00000240038 | Amylases are secreted proteins that hydrolyze 1,4-alpha-glucoside bonds in oligosaccharides and polysaccharides, and thus catalyze the first step in digestion of dietary starch and glycogen. The human genome has a cluster of several amylase genes that are expressed at high levels in either salivary gland or pancreas. This gene encodes an amylase isoenzyme produced by the pancreas. | NA |
| HBB | 3043 | hemoglobin subunit beta | ENSG00000244734 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | NA |
| REG1A | 5967 | regenerating family member 1 alpha | ENSG00000115386 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| CPA2 | 1358 | carboxypeptidase A2 | ENSG00000158516 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | NA |
| KRT13 | 3860 | keratin 13 | ENSG00000171401 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | NA |
| CELA2A | 63036 | chymotrypsin like elastase family member 2A | ENSG00000142615 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2A is secreted from the pancreas as a zymogen. In other species, elastase 2A has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| AMY2A | 279 | amylase, alpha 2A (pancreatic) | ENSG00000243480 | This gene encodes a member of the alpha-amylase family of proteins. Amylases are secreted proteins that hydrolyze 1,4-alpha-glucoside bonds in oligosaccharides and polysaccharides, catalyzing the first step in digestion of dietary starch and glycogen. This gene and several family members are present in a gene cluster on chromosome 1. This gene encodes an amylase isoenzyme produced by the pancreas. | NA |
| CTRC | 11330 | chymotrypsin C | ENSG00000162438 | This gene encodes a member of the peptidase S1 family. The encoded protein is a serum calcium-decreasing factor that has chymotrypsin-like protease activity. Alternatively spliced transcript variants have been observed, but their full-length nature has not been determined. | NA |
| PLA2G1B | 5319 | phospholipase A2 group IB | ENSG00000170890 | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | NA |
| PNLIPRP1 | 5407 | pancreatic lipase related protein 1 | ENSG00000187021 | NA | NA |
| AHNAK | 79026 | AHNAK nucleoprotein | ENSG00000124942 | NA | NA |
| IGFBP5 | 3488 | insulin like growth factor binding protein 5 | ENSG00000115461 | NA | NA |
| RP11-862L9.3 | ENSG00000266844 | NA | ENSG00000266844 | NA | NA |
| FN1 | 2335 | fibronectin 1 | ENSG00000115414 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | NA |
| KRT4 | 3851 | keratin 4 | ENSG00000170477 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| NA | NA | NA | ENSG00000250606 | NA | TRUE |
| HBA2 | 3040 | hemoglobin subunit alpha 2 | ENSG00000188536 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | NA |
| TPM2 | 7169 | tropomyosin 2 (beta) | ENSG00000198467 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| NA | NA | NA | ENSG00000165862 | NA | TRUE |
| REG1B | 5968 | regenerating family member 1 beta | ENSG00000172023 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| RP11-331F4.4 | ENSG00000240338 | NA | ENSG00000240338 | NA | NA |
| GSN | 2934 | gelsolin | ENSG00000148180 | The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | NA |
| SYTL1 | 84958 | synaptotagmin like 1 | ENSG00000142765 | NA | NA |
| SEL1L | 6400 | SEL1L ERAD E3 ligase adaptor subunit | ENSG00000071537 | The protein encoded by this gene is part of a protein complex required for the retrotranslocation or dislocation of misfolded proteins from the endoplasmic reticulum lumen to the cytosol, where they are degraded by the proteasome in a ubiquitin-dependent manner. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| SPRR3 | 6707 | small proline rich protein 3 | ENSG00000163209 | NA | NA |
| PTGDS | 5730 | prostaglandin D2 synthase | ENSG00000107317 | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | NA |
| HSP90AA1 | 3320 | heat shock protein 90kDa alpha family class A member 1 | ENSG00000080824 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| COL6A3 | 1293 | collagen type VI alpha 3 chain | ENSG00000163359 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | NA |
| MTURN | 222166 | maturin, neural progenitor differentiation regulator homolog (Xenopus) | ENSG00000180354 | NA | NA |
| NSMF | 26012 | NMDA receptor synaptonuclear signaling and neuronal migration factor | ENSG00000165802 | The protein encoded by this gene is involved in guidance of olfactory axon projections and migration of luteinizing hormone-releasing hormone neurons. Defects in this gene are a cause of idiopathic hypogonadotropic hypogonadism (IHH). Several transcript variants encoding different isoforms have been found for this gene. | NA |
| CLU | 1191 | clusterin | ENSG00000120885 | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | NA |
| MT2A | 4502 | metallothionein 2A | ENSG00000125148 | NA | NA |
| SYCN | 342898 | syncollin | ENSG00000179751 | NA | NA |
| AC019349.5 | ENSG00000229732 | NA | ENSG00000229732 | NA | NA |
| UBB | 7314 | ubiquitin B | ENSG00000170315 | This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | NA |
| CCDC136 | 64753 | coiled-coil domain containing 136 | ENSG00000128596 | NA | NA |
| PLVAP | 83483 | plasmalemma vesicle associated protein | ENSG00000130300 | NA | NA |
| TF | 7018 | transferrin | ENSG00000091513 | This gene encodes a glycoprotein with an approximate molecular weight of 76.5 kDa. It is thought to have been created as a result of an ancient gene duplication event that led to generation of homologous C and N-terminal domains each of which binds one ion of ferric iron. The function of this protein is to transport iron from the intestine, reticuloendothelial system, and liver parenchymal cells to all proliferating cells in the body. This protein may also have a physiologic role as granulocyte/pollen-binding protein (GPBP) involved in the removal of certain organic matter and allergens from serum. | NA |
| PLCH2 | 9651 | phospholipase C eta 2 | ENSG00000149527 | PLCH2 is a member of the PLC-eta family of the phosphoinositide-specific phospholipase C (PLC) superfamily of enzymes that cleave PtdIns(4,5) P2 to generate second messengers inositol 1,4,5-trisphosphate and diacylglycerol (Zhou et al., 2005 [PubMed 16107206]). | NA |
| HBA1 | 3039 | hemoglobin subunit alpha 1 | ENSG00000206172 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | NA |
| ITM2C | 81618 | integral membrane protein 2C | ENSG00000135916 | NA | NA |
| RPS3 | 6188 | ribosomal protein S3 | ENSG00000149273 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit, where it forms part of the domain where translation is initiated. The protein belongs to the S3P family of ribosomal proteins. Studies of the mouse and rat proteins have demonstrated that the protein has an extraribosomal role as an endonuclease involved in the repair of UV-induced DNA damage. The protein appears to be located in both the cytoplasm and nucleus but not in the nucleolus. Higher levels of expression of this gene in colon adenocarcinomas and adenomatous polyps compared to adjacent normal colonic mucosa have been observed. This gene is co-transcribed with the small nucleolar RNA genes U15A and U15B, which are located in its first and fifth introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| CALM2 | 805 | calmodulin 2 (phosphorylase kinase, delta) | ENSG00000143933 | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| CRNN | 49860 | cornulin | ENSG00000143536 | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | NA |
| MAST3 | 23031 | microtubule associated serine/threonine kinase 3 | ENSG00000099308 | NA | NA |
| HLA-B | 3106 | major histocompatibility complex, class I, B | ENSG00000234745 | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | NA |
| KRT6A | 3853 | keratin 6A | ENSG00000205420 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| TAGLN | 6876 | transgelin | ENSG00000149591 | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | NA |
| FGFR3 | 2261 | fibroblast growth factor receptor 3 | ENSG00000068078 | This gene encodes a member of the fibroblast growth factor receptor (FGFR) family, with its amino acid sequence being highly conserved between members and among divergent species. FGFR family members differ from one another in their ligand affinities and tissue distribution. A full-length representative protein would consist of an extracellular region, composed of three immunoglobulin-like domains, a single hydrophobic membrane-spanning segment and a cytoplasmic tyrosine kinase domain. The extracellular portion of the protein interacts with fibroblast growth factors, setting in motion a cascade of downstream signals, ultimately influencing mitogenesis and differentiation. This particular family member binds acidic and basic fibroblast growth hormone and plays a role in bone development and maintenance. Mutations in this gene lead to craniosynostosis and multiple types of skeletal dysplasia. Three alternatively spliced transcript variants that encode different protein isoforms have been described. | NA |
| CHGA | 1113 | chromogranin A | ENSG00000100604 | The protein encoded by this gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins. It is found in secretory vesicles of neurons and endocrine cells. This gene product is a precursor to three biologically active peptides; vasostatin, pancreastatin, and parastatin. These peptides act as autocrine or paracrine negative modulators of the neuroendocrine system. Two other peptides, catestatin and chromofungin, have antimicrobial activity and antifungal activity, respectively. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| STMN1 | 3925 | stathmin 1 | ENSG00000117632 | This gene belongs to the stathmin family of genes. It encodes a ubiquitous cytosolic phosphoprotein proposed to function as an intracellular relay integrating regulatory signals of the cellular environment. The encoded protein is involved in the regulation of the microtubule filament system by destabilizing microtubules. It prevents assembly and promotes disassembly of microtubules. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| ADH1B | 125 | alcohol dehydrogenase 1B (class I), beta polypeptide | ENSG00000196616 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| MT3 | 4504 | metallothionein 3 | ENSG00000087250 | NA | NA |
| CERCAM | 51148 | cerebral endothelial cell adhesion molecule | ENSG00000167123 | NA | NA |
| NEAT1 | 283131 | nuclear paraspeckle assembly transcript 1 (non-protein coding) | ENSG00000245532 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | NA |
| MTCO1P12 | ENSG00000237973 | MT-CO1 pseudogene 12 | ENSG00000237973 | NA | NA |
| HSPG2 | 3339 | heparan sulfate proteoglycan 2 | ENSG00000142798 | This gene encodes the perlecan protein, which consists of a core protein to which three long chains of glycosaminoglycans (heparan sulfate or chondroitin sulfate) are attached. The perlecan protein is a large multidomain proteoglycan that binds to and cross-links many extracellular matrix components and cell-surface molecules. It has been shown that this protein interacts with laminin, prolargin, collagen type IV, FGFBP1, FBLN2, FGF7 and transthyretin, etc., and it plays essential roles in multiple biological activities. Perlecan is a key component of the vascular extracellular matrix, where it helps to maintain the endothelial barrier function. It is a potent inhibitor of smooth muscle cell proliferation and is thus thought to help maintain vascular homeostasis. It can also promote growth factor (e.g., FGF2) activity and thus stimulate endothelial growth and regeneration. It is a major component of basement membranes, where it is involved in the stabilization of other molecules as well as being involved with glomerular permeability to macromolecules and cell adhesion. Mutations in this gene cause Schwartz-Jampel syndrome type 1, Silverman-Handmaker type of dyssegmental dysplasia, and tardive dyskinesia. Alternative splicing of this gene results in multiple transcript variants. | NA |
| APOD | 347 | apolipoprotein D | ENSG00000189058 | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | NA |
| ODF2 | 4957 | outer dense fiber of sperm tails 2 | ENSG00000136811 | The outer dense fibers are cytoskeletal structures that surround the axoneme in the middle piece and principal piece of the sperm tail. The fibers function in maintaining the elastic structure and recoil of the sperm tail as well as in protecting the tail from shear forces during epididymal transport and ejaculation. Defects in the outer dense fibers lead to abnormal sperm morphology and infertility. This gene encodes one of the major outer dense fiber proteins. Alternative splicing results in multiple transcript variants. The longer transcripts, also known as ‘Cenexins’, encode proteins with a C-terminal extension that are differentially targeted to somatic centrioles and thought to be crucial for the formation of microtubule organizing centers. | NA |
| NA | NA | NA | ENSG00000259716 | NA | TRUE |
| KRT5 | 3852 | keratin 5 | ENSG00000186081 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| PLIN2 | 123 | perilipin 2 | ENSG00000147872 | The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | NA |
| RHCG | 51458 | Rh family C glycoprotein | ENSG00000140519 | NA | NA |
| ENHO | 375704 | energy homeostasis associated | ENSG00000168913 | NA | NA |
| RPL13A | 23521 | ribosomal protein L13a | ENSG00000142541 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the L13P family of ribosomal proteins that is a component of the 60S subunit. The encoded protein also plays a role in the repression of inflammatory genes as a component of the IFN-gamma-activated inhibitor of translation (GAIT) complex. This gene is co-transcribed with the small nucleolar RNA genes U32, U33, U34, and U35, which are located in the second, fourth, fifth, and sixth introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed throughout the genome. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| FASN | 2194 | fatty acid synthase | ENSG00000169710 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | NA |
| ACTA1 | 58 | actin, alpha 1, skeletal muscle | ENSG00000143632 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | NA |
| RPS6 | 6194 | ribosomal protein S6 | ENSG00000137154 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a cytoplasmic ribosomal protein that is a component of the 40S subunit. The protein belongs to the S6E family of ribosomal proteins. It is the major substrate of protein kinases in the ribosome, with subsets of five C-terminal serine residues phosphorylated by different protein kinases. Phosphorylation is induced by a wide range of stimuli, including growth factors, tumor-promoting agents, and mitogens. Dephosphorylation occurs at growth arrest. The protein may contribute to the control of cell growth and proliferation through the selective translation of particular classes of mRNA. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | NA |
| PPL | 5493 | periplakin | ENSG00000118898 | The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | NA |
| CELA2B | 51032 | chymotrypsin like elastase family member 2B | ENSG00000215704 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2B is secreted from the pancreas as a zymogen. In other species, elastase 2B has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| GNG7 | 2788 | G protein subunit gamma 7 | ENSG00000176533 | NA | NA |
| FBXL16 | 146330 | F-box and leucine rich repeat protein 16 | ENSG00000127585 | Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | NA |
| SPINT2 | 10653 | serine peptidase inhibitor, Kunitz type, 2 | ENSG00000167642 | This gene encodes a transmembrane protein with two extracellular Kunitz domains that inhibits a variety of serine proteases. The protein inhibits HGF activator which prevents the formation of active hepatocyte growth factor. This gene is a putative tumor suppressor, and mutations in this gene result in congenital sodium diarrhea. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| JUP | 3728 | junction plakoglobin | ENSG00000173801 | This gene encodes a major cytoplasmic protein which is the only known constituent common to submembranous plaques of both desmosomes and intermediate junctions. This protein forms distinct complexes with cadherins and desmosomal cadherins and is a member of the catenin family since it contains a distinct repeating amino acid motif called the armadillo repeat. Mutation in this gene has been associated with Naxos disease. Alternative splicing occurs in this gene; however, not all transcripts have been fully described. | NA |
| LOC105372824 | 105372824 | uncharacterized LOC105372824 | ENSG00000160209 | NA | NA |
| PDXK | 8566 | pyridoxal (pyridoxine, vitamin B6) kinase | ENSG00000160209 | The protein encoded by this gene phosphorylates vitamin B6, a step required for the conversion of vitamin B6 to pyridoxal-5-phosphate, an important cofactor in intermediary metabolism. The encoded protein is cytoplasmic and probably acts as a homodimer. Alternatively spliced transcript variants have been described, but their biological validity has not been determined. | NA |
| NRGN | 4900 | neurogranin | ENSG00000154146 | Neurogranin (NRGN) is the human homolog of the neuron-specific rat RC3/neurogranin gene. This gene encodes a postsynaptic protein kinase substrate that binds calmodulin in the absence of calcium. The NRGN gene contains four exons and three introns. The exons 1 and 2 encode the protein and exons 3 and 4 contain untranslated sequences. It is suggested that the NRGN is a direct target for thyroid hormone in human brain, and that control of expression of this gene could underlie many of the consequences of hypothyroidism on mental states during development as well as in adult subjects. | NA |
| FAM107A | 11170 | family with sequence similarity 107 member A | ENSG00000168309 | NA | NA |
| VCAN | 1462 | versican | ENSG00000038427 | This gene is a member of the aggrecan/versican proteoglycan family. The protein encoded is a large chondroitin sulfate proteoglycan and is a major component of the extracellular matrix. This protein is involved in cell adhesion, proliferation, migration and angiogenesis and plays a central role in tissue morphogenesis and maintenance. Mutations in this gene are the cause of Wagner syndrome type 1. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| CKB | 1152 | creatine kinase B | ENSG00000166165 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | NA |
| DSTN | 11034 | destrin, actin depolymerizing factor | ENSG00000125868 | The product of this gene belongs to the actin-binding proteins ADF family. This family of proteins is responsible for enhancing the turnover rate of actin in vivo. This gene encodes the actin depolymerizing protein that severs actin filaments (F-actin) and binds to actin monomers (G-actin). Two transcript variants encoding distinct isoforms have been identified for this gene. | NA |
| PLXNB1 | 5364 | plexin B1 | ENSG00000164050 | NA | NA |
| ALDOA | 226 | aldolase, fructose-bisphosphate A | ENSG00000149925 | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | NA |
| FSTL1 | 11167 | follistatin like 1 | ENSG00000163430 | This gene encodes a protein with similarity to follistatin, an activin-binding protein. It contains an FS module, a follistatin-like sequence containing 10 conserved cysteine residues. This gene product is thought to be an autoantigen associated with rheumatoid arthritis. | NA |
| MGST1 | 4257 | microsomal glutathione S-transferase 1 | ENSG00000008394 | The MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism) family consists of six human proteins, two of which are involved in the production of leukotrienes and prostaglandin E, important mediators of inflammation. Other family members, demonstrating glutathione S-transferase and peroxidase activities, are involved in cellular defense against toxic, carcinogenic, and pharmacologically active electrophilic compounds. This gene encodes a protein that catalyzes the conjugation of glutathione to electrophiles and the reduction of lipid hydroperoxides. This protein is localized to the endoplasmic reticulum and outer mitochondrial membrane where it is thought to protect these membranes from oxidative stress. Several transcript variants, some non-protein coding and some protein coding, have been found for this gene. | NA |
| GAPDH | 2597 | glyceraldehyde-3-phosphate dehydrogenase | ENSG00000111640 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | NA |
| MYRF | 745 | myelin regulatory factor | ENSG00000124920 | This gene encodes a transcription factor that is required for central nervous system myelination and may regulate oligodendrocyte differentiation. It is thought to act by increasing the expression of genes that effect myelin production but may also directly promote myelin gene expression. Loss of a similar gene in mouse models results in severe demyelination. Alternative splicing results in multiple transcript variants. | NA |
| SYNPO2 | 171024 | synaptopodin 2 | ENSG00000172403 | NA | NA |
| SPOCK2 | 9806 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 2 | ENSG00000107742 | This gene encodes a protein which binds with glycosaminoglycans to form part of the extracellular matrix. The protein contains thyroglobulin type-1, follistatin-like, and calcium-binding domains, and has glycosaminoglycan attachment sites in the acidic C-terminal region. Three alternatively spliced transcript variants that encode different protein isoforms have been described for this gene. | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 15, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[16,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | symbol | summary | X_id | name | notfound |
|---|---|---|---|---|---|
| ENSG00000042832 | TG | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | 7038 | thyroglobulin | NA |
| ENSG00000115705 | TPO | This gene encodes a membrane-bound glycoprotein. The encoded protein acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones, thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter, and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene, but the full-length nature of some variants has not been determined. | 7173 | thyroid peroxidase | NA |
| ENSG00000163631 | ALB | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | 213 | albumin | NA |
| ENSG00000125618 | PAX8 | This gene encodes a member of the paired box (PAX) family of transcription factors. Members of this gene family typically encode proteins that contain a paired box domain, an octapeptide, and a paired-type homeodomain. This nuclear protein is involved in thyroid follicular cell development and expression of thyroid-specific genes. Mutations in this gene have been associated with thyroid dysgenesis, thyroid follicular carcinomas and atypical follicular thyroid adenomas. Alternatively spliced transcript variants encoding different isoforms have been described. | 7849 | paired box 8 | NA |
| ENSG00000257017 | HP | This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | 3240 | haptoglobin | NA |
| ENSG00000171560 | FGA | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | 2243 | fibrinogen alpha chain | NA |
| ENSG00000171564 | FGB | The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 2244 | fibrinogen beta chain | NA |
| ENSG00000125730 | C3 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | 718 | complement component 3 | NA |
| ENSG00000229314 | ORM1 | This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | 5004 | orosomucoid 1 | NA |
| ENSG00000107317 | PTGDS | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | 5730 | prostaglandin D2 synthase | NA |
| ENSG00000171557 | FGG | The protein encoded by this gene is the gamma component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia and thrombophilia. Alternative splicing results in transcript variants encoding different isoforms. | 2266 | fibrinogen gamma chain | NA |
| ENSG00000197971 | MBP | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | 4155 | myelin basic protein | NA |
| ENSG00000132693 | CRP | The protein encoded by this gene belongs to the pentaxin family. It is involved in several host defense related functions based on its ability to recognize foreign pathogens and damaged cells of the host and to initiate their elimination by interacting with humoral and cellular effector systems in the blood. Consequently, the level of this protein in plasma increases greatly during acute phase response to tissue injury, infection, or other inflammatory stimuli. | 1401 | C-reactive protein, pentraxin-related | NA |
| ENSG00000090920 | NA | NA | NA | NA | TRUE |
| ENSG00000111275 | ALDH2 | This protein belongs to the aldehyde dehydrogenase family of proteins. Aldehyde dehydrogenase is the second enzyme of the major oxidative pathway of alcohol metabolism. Two major liver isoforms of aldehyde dehydrogenase, cytosolic and mitochondrial, can be distinguished by their electrophoretic mobilities, kinetic properties, and subcellular localizations. Most Caucasians have two major isozymes, while approximately 50% of Orientals have the cytosolic isozyme but not the mitochondrial isozyme. A remarkably higher frequency of acute alcohol intoxication among Orientals than among Caucasians could be related to the absence of a catalytically active form of the mitochondrial isozyme. The increased exposure to acetaldehyde in individuals with the catalytically inactive form may also confer greater susceptibility to many types of cancer. This gene encodes a mitochondrial isoform, which has a low Km for acetaldehydes, and is localized in mitochondrial matrix. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 217 | aldehyde dehydrogenase 2 family (mitochondrial) | NA |
| ENSG00000164733 | CTSB | This gene encodes a member of the C1 family of peptidases. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to form the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. It is also known as amyloid precursor protein secretase and is involved in the proteolytic processing of amyloid precursor protein (APP). Incomplete proteolytic processing of APP has been suggested to be a causative factor in Alzheimer’s disease, the most common cause of dementia. Overexpression of the encoded protein has been associated with esophageal adenocarcinoma and other tumors. Multiple pseudogenes of this gene have been identified. | 1508 | cathepsin B | NA |
| ENSG00000160180 | TFF3 | Members of the trefoil family are characterized by having at least one copy of the trefoil motif, a 40-amino acid domain that contains three conserved disulfides. They are stable secretory proteins expressed in gastrointestinal mucosa. Their functions are not defined, but they may protect the mucosa from insults, stabilize the mucus layer and affect healing of the epithelium. This gene is expressed in goblet cells of the intestines and colon. This gene and two other related trefoil family member genes are found in a cluster on chromosome 21. | 7033 | trefoil factor 3 | NA |
| ENSG00000135046 | ANXA1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | 301 | annexin A1 | NA |
| ENSG00000134531 | EMP1 | NA | 2012 | epithelial membrane protein 1 | NA |
| ENSG00000111640 | GAPDH | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | 2597 | glyceraldehyde-3-phosphate dehydrogenase | NA |
| ENSG00000110245 | APOC3 | Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-I and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | 345 | apolipoprotein C3 | NA |
| ENSG00000130600 | H19 | This gene is located in an imprinted region of chromosome 11 near the insulin-like growth factor 2 (IGF2) gene. This gene is only expressed from the maternally-inherited chromosome, whereas IGF2 is only expressed from the paternally-inherited chromosome. The product of this gene is a long non-coding RNA which functions as a tumor suppressor. Mutations in this gene have been associated with Beckwith-Wiedemann Syndrome and Wilms tumorigenesis. Alternative splicing results in multiple transcript variants. | 283120 | H19, imprinted maternally expressed transcript (non-protein coding) | NA |
| ENSG00000151726 | ACSL1 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | 2180 | acyl-CoA synthetase long-chain family member 1 | NA |
| ENSG00000117984 | CTSD | This gene encodes a member of the A1 family of peptidases. The encoded preproprotein is proteolytically processed to generate multiple protein products. These products include the cathepsin D light and heavy chains, which heterodimerize to form the mature enzyme. This enzyme exhibits pepsin-like activity and plays a role in protein turnover and in the proteolytic activation of hormones and growth factors. Mutations in this gene play a causal role in neuronal ceroid lipofuscinosis-10 and may be involved in the pathogenesis of several other diseases, including breast cancer and possibly Alzheimer’s disease. | 1509 | cathepsin D | NA |
| ENSG00000122304 | PRM2 | Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | 5620 | protamine 2 | NA |
| ENSG00000091583 | APOH | Apolipoprotein H has been implicated in a variety of physiologic pathways including lipoprotein metabolism, coagulation, and the production of antiphospholipid autoantibodies. APOH may be a required cofactor for anionic phospholipid binding by the antiphospholipid autoantibodies found in sera of many patients with lupus and primary antiphospholipid syndrome, but it does not seem to be required for the reactivity of antiphospholipid autoantibodies associated with infections. | 350 | apolipoprotein H | NA |
| ENSG00000185133 | INPP5J | NA | 27124 | inositol polyphosphate-5-phosphatase J | NA |
| ENSG00000018625 | ATP1A2 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 2 subunit. Mutations in this gene result in familial basilar or hemiplegic migraines, and in a rare syndrome known as alternating hemiplegia of childhood. | 477 | ATPase Na+/K+ transporting subunit alpha 2 | NA |
| ENSG00000170315 | UBB | This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | 7314 | ubiquitin B | NA |
| ENSG00000168743 | NPNT | NA | 255743 | nephronectin | NA |
| ENSG00000026025 | VIM | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | vimentin | NA |
| ENSG00000101670 | LIPG | The protein encoded by this gene has substantial phospholipase activity and may be involved in lipoprotein metabolism and vascular biology. This protein is designated a member of the TG lipase family by its sequence and characteristic lid region which provides substrate specificity for enzymes of the TG lipase family. | 9388 | lipase G, endothelial type | NA |
| ENSG00000175084 | DES | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | desmin | NA |
| ENSG00000155657 | TTN | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, an N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | titin | NA |
| ENSG00000135929 | CYP27A1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This mitochondrial protein oxidizes cholesterol intermediates as part of the bile synthesis pathway. Since the conversion of cholesterol to bile acids is the major route for removing cholesterol from the body, this protein is important for overall cholesterol homeostasis. Mutations in this gene cause cerebrotendinous xanthomatosis, a rare autosomal recessive lipid storage disease. | 1593 | cytochrome P450 family 27 subfamily A member 1 | NA |
| ENSG00000106927 | AMBP | This gene encodes a complex glycoprotein secreted in plasma. The precursor is proteolytically processed into distinct functioning proteins: alpha-1-microglobulin, which belongs to the superfamily of lipocalin transport proteins and may play a role in the regulation of inflammatory processes, and bikunin, which is a urinary trypsin inhibitor belonging to the superfamily of Kunitz-type protease inhibitors and plays an important role in many physiological and pathological processes. This gene is located on chromosome 9 in a cluster of lipocalin genes. | 259 | alpha-1-microglobulin/bikunin precursor | NA |
| ENSG00000174437 | ATP2A2 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. | 488 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 | NA |
| ENSG00000073849 | ST6GAL1 | This gene encodes a member of glycosyltransferase family 29. The encoded protein is a type II membrane protein that catalyzes the transfer of sialic acid from CMP-sialic acid to galactose-containing substrates. The protein, which is normally found in the Golgi but can be proteolytically processed to a soluble form, is involved in the generation of the cell-surface carbohydrate determinants and differentiation antigens HB-6, CD75, and CD76. This gene has been incorrectly referred to as CD75. Three transcript variants encoding two different isoforms have been described. | 6480 | ST6 beta-galactoside alpha-2,6-sialyltransferase 1 | NA |
| ENSG00000115414 | FN1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | fibronectin 1 | NA |
| ENSG00000147872 | PLIN2 | The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | 123 | perilipin 2 | NA |
| ENSG00000158874 | APOA2 | This gene encodes apolipoprotein (apo-) A-II, which is the second most abundant protein of the high density lipoprotein particles. The protein is found in plasma as a monomer, homodimer, or heterodimer with apolipoprotein D. Defects in this gene may result in apolipoprotein A-II deficiency or hypercholesterolemia. | 336 | apolipoprotein A2 | NA |
| ENSG00000130707 | ASS1 | The protein encoded by this gene catalyzes the penultimate step of the arginine biosynthetic pathway. There are approximately 10 to 14 copies of this gene including the pseudogenes scattered across the human genome, among which the one located on chromosome 9 appears to be the only functional gene for argininosuccinate synthetase. Mutations in the chromosome 9 copy of this gene cause citrullinemia. Two transcript variants encoding the same protein have been found for this gene. | 445 | argininosuccinate synthase 1 | NA |
| ENSG00000095321 | CRAT | This gene encodes carnitine acetyltransferase (CRAT), which is a key enzyme in the metabolic pathway in mitochondria, peroxisomes and endoplasmic reticulum. CRAT catalyzes the reversible transfer of acyl groups from an acyl-CoA thioester to carnitine and regulates the ratio of acylCoA/CoA in the subcellular compartments. Two transcript variants encoding different isoforms have been found for this gene. | 1384 | carnitine O-acetyltransferase | NA |
| ENSG00000135480 | KRT7 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the simple epithelia lining the cavities of the internal organs and in the gland ducts and blood vessels. The genes encoding the type II cytokeratins are clustered in a region of chromosome 12q12-q13. Alternative splicing may result in several transcript variants; however, not all variants have been fully described. | 3855 | keratin 7 | NA |
| ENSG00000159069 | FBXW5 | This gene encodes a member of the F-box protein family, members of which are characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into three classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene contains WD-40 domains, in addition to an F-box motif, so it belongs to the Fbw class. Alternatively spliced transcript variants encoding distinct isoforms have been identified for this gene, however, they were found to be nonsense-mediated mRNA decay (NMD) candidates, hence not represented. | 54461 | F-box and WD repeat domain containing 5 | NA |
| ENSG00000175646 | PRM1 | NA | 5619 | protamine 1 | NA |
| ENSG00000122367 | LDB3 | This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | 11155 | LIM domain binding 3 | NA |
| ENSG00000060138 | YBX3 | NA | 8531 | Y-box binding protein 3 | NA |
| ENSG00000130203 | APOE | The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | 348 | apolipoprotein E | NA |
| ENSG00000169129 | AFAP1L2 | NA | 84632 | actin filament associated protein 1 like 2 | NA |
| ENSG00000133048 | CHI3L1 | Chitinases catalyze the hydrolysis of chitin, which is an abundant glycopolymer found in insect exoskeletons and fungal cell walls. The glycoside hydrolase 18 family of chitinases includes eight human family members. This gene encodes a glycoprotein member of the glycosyl hydrolase 18 family. The protein lacks chitinase activity and is secreted by activated macrophages, chondrocytes, neutrophils and synovial cells. The protein is thought to play a role in the process of inflammation and tissue remodeling. | 1116 | chitinase 3 like 1 | NA |
| ENSG00000101210 | EEF1A2 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 2) is expressed in brain, heart and skeletal muscle, and the other isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas. This gene may be critical in the development of ovarian cancer. | 1917 | eukaryotic translation elongation factor 1 alpha 2 | NA |
| ENSG00000237973 | MTCO1P12 | NA | ENSG00000237973 | MT-CO1 pseudogene 12 | NA |
| ENSG00000128591 | FLNC | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 | filamin C | NA |
| ENSG00000158828 | PINK1 | This gene encodes a serine/threonine protein kinase that localizes to mitochondria. It is thought to protect cells from stress-induced mitochondrial dysfunction. Mutations in this gene cause one form of autosomal recessive early-onset Parkinson disease. | 65018 | PTEN induced putative kinase 1 | NA |
| ENSG00000166598 | HSP90B1 | This gene encodes a member of a family of adenosine triphosphate(ATP)-metabolizing molecular chaperones with roles in stabilizing and folding other proteins. The encoded protein is localized to melanosomes and the endoplasmic reticulum. Expression of this protein is associated with a variety of pathogenic states, including tumor formation. There is a microRNA gene located within the 5’ exon of this gene. There are pseudogenes for this gene on chromosomes 1 and 15. | 7184 | heat shock protein 90kDa beta family member 1 | NA |
| ENSG00000080824 | HSP90AA1 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | 3320 | heat shock protein 90kDa alpha family class A member 1 | NA |
| ENSG00000115112 | TFCP2L1 | NA | 29842 | transcription factor CP2-like 1 | NA |
| ENSG00000189058 | APOD | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | 347 | apolipoprotein D | NA |
| ENSG00000175265 | GOLGA8A | The Golgi apparatus, which participates in glycosylation and transport of proteins and lipids in the secretory pathway, consists of a series of stacked, flattened membrane sacs referred to as cisternae. Interactions between the Golgi and microtubules are thought to be important for the reorganization of the Golgi after it fragments during mitosis. The golgins constitute a family of proteins which are localized to the Golgi. This gene encodes a golgin which structurally resembles its family member GOLGA2, suggesting that they may share a similar function. There are many similar copies of this gene on chromosome 15. Alternative splicing results in multiple transcript variants. | 23015 | golgin A8 family member A | NA |
| ENSG00000143878 | RHOB | NA | 388 | ras homolog family member B | NA |
| ENSG00000205517 | RGL3 | NA | 57139 | ral guanine nucleotide dissociation stimulator like 3 | NA |
| ENSG00000104879 | CKM | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | 1158 | creatine kinase, M-type | NA |
| ENSG00000166347 | CYB5A | The protein encoded by this gene is a membrane-bound cytochrome that reduces ferric hemoglobin (methemoglobin) to ferrous hemoglobin, which is required for stearyl-CoA-desaturase activity. Defects in this gene are a cause of type IV hereditary methemoglobinemia. Three transcript variants encoding different isoforms have been found for this gene. | 1528 | cytochrome b5 type A | NA |
| ENSG00000173641 | HSPB7 | NA | 27129 | heat shock protein family B (small) member 7 | NA |
| ENSG00000151729 | SLC25A4 | This gene is a member of the mitochondrial carrier subfamily of solute carrier protein genes. The product of this gene functions as a gated pore that translocates ADP from the cytoplasm into the mitochondrial matrix and ATP from the mitochondrial matrix into the cytoplasm. The protein forms a homodimer embedded in the inner mitochondrial membrane. Mutations in this gene have been shown to result in autosomal dominant progressive external ophthalmoplegia and familial hypertrophic cardiomyopathy. | 291 | solute carrier family 25 member 4 | NA |
| ENSG00000169738 | DCXR | The protein encoded by this gene acts as a homotetramer to catalyze diacetyl reductase and L-xylulose reductase reactions. The encoded protein may play a role in the uronate cycle of glucose metabolism and in the cellular osmoregulation in the proximal renal tubules. Defects in this gene are a cause of pentosuria. Two transcript variants encoding different isoforms have been found for this gene. | 51181 | dicarbonyl/L-xylulose reductase | NA |
| ENSG00000088836 | SLC4A11 | This gene encodes a voltage-regulated, electrogenic sodium-coupled borate cotransporter that is essential for borate homeostasis, cell growth and cell proliferation. Mutations in this gene have been associated with a number of endothelial corneal dystrophies including recessive corneal endothelial dystrophy 2, corneal dystrophy and perceptive deafness, and Fuchs endothelial corneal dystrophy. Multiple transcript variants encoding different isoforms have been described. | 83959 | solute carrier family 4 member 11 | NA |
| ENSG00000266844 | RP11-862L9.3 | NA | ENSG00000266844 | NA | NA |
| ENSG00000054690 | PLEKHH1 | NA | 57475 | pleckstrin homology, MyTH4 and FERM domain containing H1 | NA |
| ENSG00000265401 | RP11-138I1.4 | NA | ENSG00000265401 | NA | NA |
| ENSG00000137198 | GMPR | This gene encodes an enzyme that catalyzes the irreversible and NADPH-dependent reductive deamination of GMP to IMP. The protein also functions in the re-utilization of free intracellular bases and purine nucleosides. | 2766 | guanosine monophosphate reductase | NA |
| ENSG00000175206 | NPPA | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | 4878 | natriuretic peptide A | NA |
| ENSG00000010318 | PHF7 | Spermatogenesis is a complex process regulated by extracellular and intracellular factors as well as cellular interactions among interstitial cells of the testis, Sertoli cells, and germ cells. This gene is expressed in the testis in Sertoli cells but not germ cells. The protein encoded by this gene contains plant homeodomain (PHD) finger domains, also known as leukemia associated protein (LAP) domains, believed to be involved in transcriptional regulation. The protein, which localizes to the nucleus of transfected cells, has been implicated in the transcriptional regulation of spermatogenesis. Alternate splicing results in multiple transcript variants of this gene. | 51533 | PHD finger protein 7 | NA |
| ENSG00000269968 | RP5-940J5.9 | NA | ENSG00000269968 | NA | NA |
| ENSG00000182718 | ANXA2 | This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions as an autocrine factor which heightens osteoclast formation and bone resorption. This gene has three pseudogenes located on chromosomes 4, 9 and 10, respectively. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 302 | annexin A2 | NA |
| ENSG00000198467 | TPM2 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 7169 | tropomyosin 2 (beta) | NA |
| ENSG00000101605 | MYOM1 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD (myomesin 1) and 165 kD (myomesin 2). This protein, myomesin 1, like myomesin 2, titin, and other myofibrillar proteins contains structural modules with strong homology to either fibronectin type III (motif I) or immunoglobulin C2 (motif II) domains. Myomesin 1 and myomesin 2 each have a unique N-terminal region followed by 12 modules of motif I or motif II, in the arrangement II-II-I-I-I-I-I-II-II-II-II-II. The two proteins share 50% sequence identity in this repeat-containing region. The head structure formed by these 2 proteins on one end of the titin string extends into the center of the M band. The integrating structure of the sarcomere arises from muscle-specific members of the superfamily of immunoglobulin-like proteins. Alternatively spliced transcript variants encoding different isoforms have been identified. | 8736 | myomesin 1 | NA |
| ENSG00000185100 | ADSSL1 | This gene encodes a member of the adenylosuccinate synthase family of proteins. The encoded muscle-specific enzyme plays a role in the purine nucleotide cycle by catalyzing the first step in the conversion of inosine monophosphate (IMP) to adenosine monophosphate (AMP). Mutations in this gene may cause adolescent onset distal myopathy. Alternative splicing results in multiple transcript variants. | 122622 | adenylosuccinate synthase like 1 | NA |
| ENSG00000115255 | REEP6 | NA | 92840 | receptor accessory protein 6 | NA |
| ENSG00000143549 | TPM3 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. | 7170 | tropomyosin 3 | NA |
| ENSG00000127884 | ECHS1 | The protein encoded by this gene functions in the second step of the mitochondrial fatty acid beta-oxidation pathway. It catalyzes the hydration of 2-trans-enoyl-coenzyme A (CoA) intermediates to L-3-hydroxyacyl-CoAs. The gene product is a member of the hydratase/isomerase superfamily. It localizes to the mitochondrial matrix. Transcript variants utilizing alternative transcription initiation sites have been described in the literature. | 1892 | enoyl-CoA hydratase, short chain, 1, mitochondrial | NA |
| ENSG00000171992 | SYNPO | Synaptopodin is an actin-associated protein that may play a role in actin-based cell shape and motility. The name synaptopodin derives from the protein’s associations with postsynaptic densities and dendritic spines and with renal podocytes (Mundel et al., 1997 [PubMed 9314539]). | 11346 | synaptopodin | NA |
| ENSG00000129538 | RNASE1 | This gene encodes a member of the pancreatic-type of secretory ribonucleases, a subset of the ribonuclease A superfamily. The encoded endonuclease cleaves internal phosphodiester RNA bonds on the 3’-side of pyrimidine bases. It prefers poly(C) as a substrate and hydrolyzes 2’,3’-cyclic nucleotides, with a pH optimum near 8.0. The encoded protein is monomeric and more commonly acts to degrade ds-RNA over ss-RNA. Alternative splicing occurs at this locus and four transcript variants encoding the same protein have been identified. | 6035 | ribonuclease A family member 1, pancreatic | NA |
| ENSG00000234745 | HLA-B | HLA-B belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from the endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon 1 encodes the leader peptide, exon 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Hundreds of HLA-B alleles have been described. | 3106 | major histocompatibility complex, class I, B | NA |
| ENSG00000123240 | OPTN | This gene encodes the coiled-coil containing protein optineurin. Optineurin may play a role in normal-tension glaucoma and adult-onset primary open angle glaucoma. Optineurin interacts with adenovirus E3-14.7K protein and may utilize tumor necrosis factor-alpha or Fas-ligand pathways to mediate apoptosis, inflammation or vasoconstriction. Optineurin may also function in cellular morphogenesis and membrane trafficking, vesicle trafficking, and transcription activation through its interactions with the RAB8, huntingtin, and transcription factor IIIA proteins. Alternative splicing results in multiple transcript variants encoding the same protein. | 10133 | optineurin | NA |
| ENSG00000106538 | RARRES2 | This gene encodes a secreted chemotactic protein that initiates chemotaxis via the ChemR23 G protein-coupled seven-transmembrane domain ligand. Expression of this gene is upregulated by the synthetic retinoid tazarotene and occurs in a wide variety of tissues. The active protein has several roles, including that as an adipokine and as an antimicrobial protein with activity against bacteria and fungi. | 5919 | retinoic acid receptor responder 2 | NA |
| ENSG00000143164 | DCAF6 | NA | 55827 | DDB1 and CUL4 associated factor 6 | NA |
| ENSG00000182054 | IDH2 | Isocitrate dehydrogenases catalyze the oxidative decarboxylation of isocitrate to 2-oxoglutarate. These enzymes belong to two distinct subclasses, one of which utilizes NAD(+) as the electron acceptor and the other NADP(+). Five isocitrate dehydrogenases have been reported: three NAD(+)-dependent isocitrate dehydrogenases, which localize to the mitochondrial matrix, and two NADP(+)-dependent isocitrate dehydrogenases, one of which is mitochondrial and the other predominantly cytosolic. Each NADP(+)-dependent isozyme is a homodimer. The protein encoded by this gene is the NADP(+)-dependent isocitrate dehydrogenase found in the mitochondria. It plays a role in intermediary metabolism and energy production. This protein may tightly associate or interact with the pyruvate dehydrogenase complex. Alternative splicing results in multiple transcript variants. | 3418 | isocitrate dehydrogenase (NADP(+)) 2, mitochondrial | NA |
| ENSG00000239775 | AC017116.11 | NA | ENSG00000239775 | NA | NA |
| ENSG00000106258 | CYP3A5 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The encoded protein metabolizes drugs as well as the steroid hormones testosterone and progesterone. This gene is part of a cluster of cytochrome P450 genes on chromosome 7q21.1. Two pseudogenes of this gene have been identified within this cluster on chromosome 7. Expression of this gene is widely variable among populations, and a single nucleotide polymorphism that affects transcript splicing has been associated with susceptibility to hypertension. Alternative splicing results in multiple transcript variants. | 1577 | cytochrome P450 family 3 subfamily A member 5 | NA |
| ENSG00000148672 | GLUD1 | This gene encodes glutamate dehydrogenase, which is a mitochondrial matrix enzyme that catalyzes the oxidative deamination of glutamate to alpha-ketoglutarate and ammonia. This enzyme has an important role in regulating amino acid-induced insulin secretion. It is allosterically activated by ADP and inhibited by GTP and ATP. Activating mutations in this gene are a common cause of congenital hyperinsulinism. Alternative splicing of this gene results in multiple transcript variants. The related glutamate dehydrogenase 2 gene on the human X-chromosome originated from this gene via retrotransposition and encodes a soluble form of glutamate dehydrogenase. Related pseudogenes have been identified on chromosomes 10, 18 and X. | 2746 | glutamate dehydrogenase 1 | NA |
| ENSG00000196091 | MYBPC1 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 4604 | myosin binding protein C, slow type | NA |
| ENSG00000073060 | SCARB1 | The protein encoded by this gene is a plasma membrane receptor for high density lipoprotein cholesterol (HDL). The encoded protein mediates cholesterol transfer to and from HDL. In addition, this protein is a receptor for hepatitis C virus glycoprotein E2. Two transcript variants encoding different isoforms have been found for this gene. | 949 | scavenger receptor class B member 1 | NA |
| ENSG00000116171 | SCP2 | This gene encodes two proteins: sterol carrier protein X (SCPx) and sterol carrier protein 2 (SCP2), as a result of transcription initiation from 2 independently regulated promoters. The transcript initiated from the proximal promoter encodes the longer SCPx protein, and the transcript initiated from the distal promoter encodes the shorter SCP2 protein, with the 2 proteins sharing a common C-terminus. Evidence suggests that the SCPx protein is a peroxisome-associated thiolase that is involved in the oxidation of branched chain fatty acids, while the SCP2 protein is thought to be an intracellular lipid transfer protein. This gene is highly expressed in organs involved in lipid metabolism, and may play a role in Zellweger syndrome, in which cells are deficient in peroxisomes and have impaired bile acid synthesis. Alternative splicing of this gene produces multiple transcript variants, some encoding different isoforms. | 6342 | sterol carrier protein 2 | NA |
| ENSG00000175899 | A2M | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | 2 | alpha-2-macroglobulin | NA |
| ENSG00000086015 | MAST2 | NA | 23139 | microtubule associated serine/threonine kinase 2 | NA |
| ENSG00000149925 | ALDOA | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | 226 | aldolase, fructose-bisphosphate A | NA |
| ENSG00000119938 | PPP1R3C | This gene encodes a regulatory subunit of protein phosphatase-1 (PP1). PP1 catalyzes reversible protein phosphorylation, which is important in a wide range of cellular activities: neuronal, muscular, RNA splicing, protein synthesis, cell death, and glycogen metabolism, to name just a few. By interacting with different regulatory subunits, PP1 is directed to different parts of the cell, to different substrates, or to respond to extracellular signals. | 5507 | protein phosphatase 1 regulatory subunit 3C | NA |
| ENSG00000167701 | GPT | This gene encodes cytosolic alanine aminotransaminase 1 (ALT1); also known as glutamate-pyruvate transaminase 1. This enzyme catalyzes the reversible transamination between alanine and 2-oxoglutarate to generate pyruvate and glutamate and, therefore, plays a key role in the intermediary metabolism of glucose and amino acids. Serum activity levels of this enzyme are routinely used as a biomarker of liver injury caused by drug toxicity, infection, alcohol, and steatosis. A related gene on chromosome 16 encodes a putative mitochondrial alanine aminotransaminase. | 2875 | glutamic-pyruvate transaminase (alanine aminotransferase) | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 16, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
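Some of the annotation tables in this report include a `notfound` column, which is `TRUE` for Ensembl IDs that the query did not resolve. If those IDs should be left out of the per-cluster gene lists, one option is to filter on that column before writing. The snippet below is a minimal sketch under that assumption, not part of the original pipeline: `matched_queries` is a hypothetical helper, and `toy` is a small stand-in for `out`, built from IDs that appear in the tables of this report.

```r
# Hypothetical helper (not part of the original analysis): return only the
# query IDs that were resolved. `annot` is assumed to have a `query` column
# and, when some IDs were unmatched, a logical `notfound` column
# (TRUE for misses, NA otherwise).
matched_queries <- function(annot) {
  annot <- as.data.frame(annot)
  if (!"notfound" %in% names(annot)) {
    # No misses were reported, so every query is kept.
    return(as.character(annot$query))
  }
  missed <- !is.na(annot$notfound) & annot$notfound
  as.character(annot$query[!missed])
}

# Toy stand-in for `out`, so the sketch runs without a network query.
toy <- data.frame(
  query = c("ENSG00000175899", "ENSG00000256309", "ENSG00000149925"),
  symbol = c("A2M", NA, "ALDOA"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

kept <- matched_queries(toy)

# Write the retained IDs one per line, as the report does for each cluster.
# A temporary file is used here instead of the report's output directory.
out_file <- tempfile(fileext = ".txt")
write.table(kept, out_file, col.names = FALSE, row.names = FALSE, quote = FALSE)
```

Whether unmatched IDs should be dropped or kept depends on how the gene lists are used downstream; the report itself writes every query ID.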
out <- mygene::queryMany(gene_list[17,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | symbol | name | query | notfound |
|---|---|---|---|---|---|
| The protein encoded by this gene localizes to focal adhesions, regions of the plasma membrane where the cell attaches to the extracellular matrix. This protein crosslinks actin filaments and contains a Src homology 2 (SH2) domain, which is often found in molecules involved in signal transduction. This protein is a substrate of calpain II. Alternative splicing results in multiple transcript variants encoding different isoforms. | 7145 | TNS1 | tensin 1 | ENSG00000079308 | NA |
| NA | 8404 | SPARCL1 | SPARC like 1 | ENSG00000152583 | NA |
| This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | 4240 | MFGE8 | milk fat globule-EGF factor 8 protein | ENSG00000140545 | NA |
| This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | VIM | vimentin | ENSG00000026025 | NA |
| This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | 8490 | RGS5 | regulator of G-protein signaling 5 | ENSG00000143248 | NA |
| The protein encoded by this gene is a leucine-rich repeat protein present in connective tissue extracellular matrix. This protein functions as a molecule anchoring basement membranes to the underlying connective tissue. This protein has been shown to bind type I collagen to basement membranes and type II collagen to cartilage. It also binds the basement membrane heparan sulfate proteoglycan perlecan. This protein is suggested to be involved in the pathogenesis of Hutchinson-Gilford progeria (HGP), which is reported to lack the binding of collagen in basement membranes and cartilage. Alternatively spliced transcript variants encoding the same protein have been observed. | 5549 | PRELP | proline and arginine rich end leucine rich repeat protein | ENSG00000188783 | NA |
| This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix (ECM). Expression of this gene is induced in response to mitogenic stimulation and this netrin domain-containing protein is localized to the ECM. Mutations in this gene have been associated with the autosomal dominant disorder Sorsby’s fundus dystrophy. | 7078 | TIMP3 | TIMP metallopeptidase inhibitor 3 | ENSG00000100234 | NA |
| The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | 4256 | MGP | matrix Gla protein | ENSG00000111341 | NA |
| This gene encodes a member of the insulin-like growth factor (IGF)-binding protein (IGFBP) family. IGFBPs bind IGFs with high affinity, and regulate IGF availability in body fluids and tissues and modulate IGF binding to its receptors. This protein binds IGF-I and IGF-II with relatively low affinity, and belongs to a subfamily of low-affinity IGFBPs. It also stimulates prostacyclin production and cell adhesion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene, and one variant has been associated with retinal arterial macroaneurysm (PMID:21835307). | 3490 | IGFBP7 | insulin like growth factor binding protein 7 | ENSG00000163453 | NA |
| Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. | 7052 | TGM2 | transglutaminase 2 | ENSG00000198959 | NA |
| This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | 165 | AEBP1 | AE binding protein 1 | ENSG00000106624 | NA |
| This gene encodes a protein with an N-terminal half that contains cysteine/histidine motifs and leucine zipper-like repeats, and the C-terminal half is rich in arginine and glutamate residues (RE domain) and arginine and serine residues (RS domain). This protein localizes with a speckled pattern in the nucleus, and could be involved in the formation of spliceosome via the RE and RS domains. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | 51747 | LUC7L3 | LUC7 like 3 pre-mRNA splicing factor | ENSG00000108848 | NA |
| NA | 23524 | SRRM2 | serine/arginine repetitive matrix 2 | ENSG00000167978 | NA |
| The leiomodin 1 protein has a putative membrane-spanning region and 2 types of tandemly repeated blocks. The transcript is expressed in all tissues tested, with the highest levels in thyroid, eye muscle, skeletal muscle, and ovary. Increased expression of leiomodin 1 may be linked to Graves’ disease and thyroid-associated ophthalmopathy. | 25802 | LMOD1 | leiomodin 1 | ENSG00000163431 | NA |
| NA | 116983 | ACAP3 | ArfGAP with coiled-coil, ankyrin repeat and PH domains 3 | ENSG00000131584 | NA |
| This gene encodes a transcription factor involved in the induction of genes regulated by oxygen, which is induced as oxygen levels fall. The encoded protein contains a basic-helix-loop-helix domain protein dimerization domain as well as a domain found in proteins in signal transduction pathways which respond to oxygen levels. Mutations in this gene are associated with erythrocytosis familial type 4. | 2034 | EPAS1 | endothelial PAS domain protein 1 | ENSG00000116016 | NA |
| NA | 6625 | SNRNP70 | small nuclear ribonucleoprotein U1 subunit 70 | ENSG00000104852 | NA |
| The product of this gene belongs to the actin-binding proteins ADF family. This family of proteins is responsible for enhancing the turnover rate of actin in vivo. This gene encodes the actin depolymerizing protein that severs actin filaments (F-actin) and binds to actin monomers (G-actin). Two transcript variants encoding distinct isoforms have been identified for this gene. | 11034 | DSTN | destrin, actin depolymerizing factor | ENSG00000125868 | NA |
| This gene encodes a member of a subfamily of LIM domain proteins that are characterized by an N-terminal proline-rich region and three C-terminal LIM domains. The encoded protein localizes to the cell periphery in focal adhesions and may be involved in cell-cell adhesion and cell motility. This protein also shuttles through the nucleus and may function as a transcriptional co-activator. This gene is located at the junction of certain disease-related chromosomal translocations, which result in the expression of chimeric proteins that may promote tumor growth. Alternative splicing results in multiple transcript variants. | 4026 | LPP | LIM domain containing preferred translocation partner in lipoma | ENSG00000145012 | NA |
| Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | 2 | A2M | alpha-2-macroglobulin | ENSG00000175899 | NA |
| This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | 3983 | ABLIM1 | actin binding LIM protein 1 | ENSG00000099204 | NA |
| This gene encodes a member of the polycystin protein family. The encoded glycoprotein contains a large N-terminal extracellular region, multiple transmembrane domains and a cytoplasmic C-tail. It is an integral membrane protein that functions as a regulator of calcium permeable cation channels and intracellular calcium homoeostasis. It is also involved in cell-cell/matrix interactions and may modulate G-protein-coupled signal-transduction pathways. It plays a role in renal tubular development, and mutations in this gene cause autosomal dominant polycystic kidney disease type 1 (ADPKD1). ADPKD1 is characterized by the growth of fluid-filled cysts that replace normal renal tissue and result in end-stage renal failure. Splice variants encoding different isoforms have been noted for this gene. Also, six pseudogenes, closely linked in a known duplicated region on chromosome 16p, have been described. | 5310 | PKD1 | polycystin 1, transient receptor potential channel interacting | ENSG00000008710 | NA |
| This gene encodes a type IV collagen alpha protein. Type IV collagen proteins are integral components of basement membranes. This gene shares a bidirectional promoter with a paralogous gene on the opposite strand. The protein consists of an amino-terminal 7S domain, a triple-helix forming collagenous domain, and a carboxy-terminal non-collagenous domain. It functions as part of a heterotrimer and interacts with other extracellular matrix components such as perlecans, proteoglycans, and laminins. In addition, proteolytic cleavage of the non-collagenous carboxy-terminal domain results in a biologically active fragment known as arresten, which has anti-angiogenic and tumor suppressor properties. Mutations in this gene cause porencephaly, cerebrovascular disease, and renal and muscular defects. Alternative splicing results in multiple transcript variants. | 1282 | COL4A1 | collagen type IV alpha 1 chain | ENSG00000187498 | NA |
| NA | 388 | RHOB | ras homolog family member B | ENSG00000143878 | NA |
| This gene encodes a cytoskeletal protein that is concentrated in areas of cell-substratum and cell-cell contacts. The encoded protein plays a significant role in the assembly of actin filaments and in spreading and migration of various cell types, including fibroblasts and osteoclasts. It codistributes with integrins in the cell surface membrane in order to assist in the attachment of adherent cells to extracellular matrices and of lymphocytes to other cells. The N-terminus of this protein contains elements for localization to cell-extracellular matrix junctions. The C-terminus contains binding sites for proteins such as beta-1-integrin, actin, and vinculin. | 7094 | TLN1 | talin 1 | ENSG00000137076 | NA |
| This gene encodes a member of the fibrillar collagen family, and plays a role during the calcification of cartilage and the transition of cartilage to bone. The encoded protein product is a preproprotein. It includes an N-terminal signal peptide, which is followed by an N-terminal propeptide, mature peptide and a C-terminal propeptide. The N-terminal propeptide contains thrombospondin N-terminal-like and laminin G-like domains. The mature peptide is a major triple-helical region. The C-terminal propeptide, also known as COLFI domain, plays crucial roles in tissue growth and repair. Mutations in this gene cause Steel syndrome. Alternatively spliced transcript variants have been found, but the full-length nature of some variants has not been determined. | 85301 | COL27A1 | collagen type XXVII alpha 1 | ENSG00000196739 | NA |
| This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. The C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. | 1284 | COL4A2 | collagen type IV alpha 2 | ENSG00000134871 | NA |
| Growth arrest-specific 7 is expressed primarily in terminally differentiated brain cells and predominantly in mature cerebellar Purkinje neurons. GAS7 plays a putative role in neuronal development. Several transcript variants encoding proteins which vary in the N-terminus have been described. | 8522 | GAS7 | growth arrest specific 7 | ENSG00000007237 | NA |
| The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | 6876 | TAGLN | transgelin | ENSG00000149591 | NA |
| This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 7169 | TPM2 | tropomyosin 2 (beta) | ENSG00000198467 | NA |
| This gene encodes a cytoskeletal protein that is required for organizing the actin cytoskeleton. The protein is a component of actin-containing microfilaments, and it is involved in the control of cell shape, adhesion, and contraction. Polymorphisms in this gene are associated with a susceptibility to pancreatic cancer type 1, and also with a risk for myocardial infarction. Alternative splicing results in multiple transcript variants. | 23022 | PALLD | palladin, cytoskeletal associated protein | ENSG00000129116 | NA |
| The Golgi apparatus, which participates in glycosylation and transport of proteins and lipids in the secretory pathway, consists of a series of stacked, flattened membrane sacs referred to as cisternae. Interactions between the Golgi and microtubules are thought to be important for the reorganization of the Golgi after it fragments during mitosis. The golgins constitute a family of proteins which are localized to the Golgi. This gene encodes a golgin which structurally resembles its family member GOLGA2, suggesting that they may share a similar function. There are many similar copies of this gene on chromosome 15. Alternative splicing results in multiple transcript variants. | 23015 | GOLGA8A | golgin A8 family member A | ENSG00000175265 | NA |
| The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. This protein is highly expressed in brain tissue and may play a role in macrophage lipid metabolism and neural development. Two transcript variants encoding different isoforms have been found for this gene. | 20 | ABCA2 | ATP binding cassette subfamily A member 2 | ENSG00000107331 | NA |
| The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | 4155 | MBP | myelin basic protein | ENSG00000197971 | NA |
| This gene encodes one of the three enolase isoenzymes found in mammals. This isoenzyme, a homodimer, is found in mature neurons and cells of neuronal origin. A switch from alpha enolase to gamma enolase occurs in neural tissue during development in rats and primates. | 2026 | ENO2 | enolase 2 | ENSG00000111674 | NA |
| The protein encoded by this gene is a member of the formin-binding-protein family. The protein contains an N-terminal Fer/Cdc42-interacting protein 4 (CIP4) homology (FCH) domain followed by a coiled-coil domain, a proline-rich motif, a second coiled-coil domain, a Rho family protein-binding domain (RBD), and a C-terminal SH3 domain. This protein binds sorting nexin 2 (SNX2), tankyrase (TNKS), and dynamin; an interaction between this protein and formin has not been demonstrated yet in human. | 23048 | FNBP1 | formin binding protein 1 | ENSG00000187239 | NA |
| NA | 140710 | SOGA1 | suppressor of glucose, autophagy associated 1 | ENSG00000149639 | NA |
| PPFIA4, or liprin-alpha-4, belongs to the liprin-alpha gene family. See liprin-alpha-1 (LIP1, or PPFIA1; MIM 611054) for background on liprins. | 8497 | PPFIA4 | PTPRF interacting protein alpha 4 | ENSG00000143847 | NA |
| The protein encoded by this gene is a member of the serine/arginine (SR)-rich family of pre-mRNA splicing factors, which constitute part of the spliceosome. Each of these factors contains an RNA recognition motif (RRM) for binding RNA and an RS domain for binding other proteins. The RS domain is rich in serine and arginine residues and facilitates interaction between different SR splicing factors. In addition to being critical for mRNA splicing, the SR proteins have also been shown to be involved in mRNA export from the nucleus and in translation. Alternative splicing results in multiple transcript variants. | 6430 | SRSF5 | serine and arginine rich splicing factor 5 | ENSG00000100650 | NA |
| This gene encodes a CBL-associated protein which functions in the signaling and stimulation of insulin. Mutations in this gene may be associated with human disorders of insulin resistance. Alternative splicing results in multiple transcript variants. | 10580 | SORBS1 | sorbin and SH3 domain containing 1 | ENSG00000095637 | NA |
| NA | 7089 | TLE2 | transducin like enhancer of split 2 | ENSG00000065717 | NA |
| NA | 27129 | HSPB7 | heat shock protein family B (small) member 7 | ENSG00000173641 | NA |
| NA | NA | NA | NA | ENSG00000256309 | TRUE |
| Chloride channels are a diverse group of proteins that regulate fundamental cellular processes including stabilization of cell membrane potential, transepithelial transport, maintenance of intracellular pH, and regulation of cell volume. Chloride intracellular channel 4 (CLIC4) protein, encoded by the CLIC4 gene, is a member of the p64 family; the gene is expressed in many tissues and exhibits an intracellular vesicular pattern in Panc-1 cells (pancreatic cancer cells). | 25932 | CLIC4 | chloride intracellular channel 4 | ENSG00000169504 | NA |
| A human melanoma-associated chondroitin sulfate proteoglycan plays a role in stabilizing cell-substratum interactions during early events of melanoma cell spreading on endothelial basement membranes. CSPG4 represents an integral membrane chondroitin sulfate proteoglycan expressed by human malignant melanoma cells. | 1464 | CSPG4 | chondroitin sulfate proteoglycan 4 | ENSG00000173546 | NA |
| Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | 87 | ACTN1 | actinin alpha 1 | ENSG00000072110 | NA |
| This gene encodes a nonadrenergic imidazoline-1 receptor protein that localizes to the cytosol and anchors to the inner layer of the plasma membrane. The orthologous mouse protein has been shown to influence cytoskeletal organization and cell migration by binding to alpha-5-beta-1 integrin. In humans, this protein has been shown to bind to the adapter insulin receptor substrate 4 (IRS4) to mediate translocation of alpha-5 integrin from the cell membrane to endosomes. Expression of this protein was reduced in human breast cancers while its overexpression reduced tumor growth and metastasis; possibly by limiting the expression of alpha-5 integrin. In human cardiac tissue, this gene was found to affect cell growth and death while in neural tissue it affected neuronal growth and differentiation. Alternative splicing results in multiple transcript variants encoding different isoforms. Some isoforms lack the expected C-terminal domains of a functional imidazoline receptor. | 11188 | NISCH | nischarin | ENSG00000010322 | NA |
| NA | 25957 | PNISR | PNN interacting serine and arginine rich protein | ENSG00000132424 | NA |
| The product of this gene belongs to the Serine/Threonine protein kinase family, and to the Ca(2+)/calmodulin-dependent protein kinase subfamily. The major isoform of this gene plays a role in the calcium/calmodulin-dependent (CaM) kinase cascade by phosphorylating the downstream kinases CaMK1 and CaMK4. Protein products of this gene also phosphorylate AMP-activated protein kinase (AMPK). This gene has its strongest expression in the brain and influences signalling cascades involved with learning and memory, neuronal differentiation and migration, neurite outgrowth, and synapse formation. Alternative splicing results in multiple transcript variants encoding distinct isoforms. The identified isoforms differ in their ability to undergo autophosphorylation and to phosphorylate downstream kinases. | 10645 | CAMKK2 | calcium/calmodulin-dependent protein kinase kinase 2 | ENSG00000110931 | NA |
| This gene encodes a member of the EPS8 gene family. The encoded protein, like other members of the family, is thought to link growth factor stimulation to actin organization, generating functional redundancy in the pathways that regulate actin cytoskeletal remodeling. | 64787 | EPS8L2 | EPS8 like 2 | ENSG00000177106 | NA |
| The protein encoded by this gene belongs to the cyclin family. Through its interaction with several proteins, such as RNA polymerase II, splicing factors, and cyclin-dependent kinases, this protein functions as a regulator of the pre-mRNA splicing process, as well as in inducing apoptosis by modulating the expression of apoptotic and antiapoptotic proteins. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | 81669 | CCNL2 | cyclin L2 | ENSG00000221978 | NA |
| Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | 7038 | TG | thyroglobulin | ENSG00000042832 | NA |
| The product of this gene belongs to the integrin alpha chain family. Integrins are heterodimeric integral membrane proteins composed of an alpha subunit and a beta subunit that function in cell surface adhesion and signaling. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 5 subunit. This subunit associates with the beta 1 subunit to form a fibronectin receptor. This integrin may promote tumor invasion, and higher expression of this gene may be correlated with shorter survival time in lung cancer patients. Note that the integrin alpha 5 and integrin alpha V subunits are encoded by distinct genes. | 3678 | ITGA5 | integrin subunit alpha 5 | ENSG00000161638 | NA |
| The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins l, h and b. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). | 1476 | CSTB | cystatin B | ENSG00000160213 | NA |
| This gene encodes an integral membrane protein associated with presynaptic vesicles in neuronal cells. The exact function of this protein is unclear, but studies of a similar murine protein suggest that it functions in synaptic plasticity without being required for synaptic transmission. The gene product belongs to the synaptogyrin gene family. Three alternatively spliced variants encoding three different isoforms have been identified. | 9145 | SYNGR1 | synaptogyrin 1 | ENSG00000100321 | NA |
| NA | NA | NA | NA | ENSG00000163486 | TRUE |
| The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | 4629 | MYH11 | myosin, heavy chain 11, smooth muscle | ENSG00000133392 | NA |
| PLCH2 is a member of the PLC-eta family of the phosphoinositide-specific phospholipase C (PLC) superfamily of enzymes that cleave PtdIns(4,5) P2 to generate second messengers inositol 1,4,5-trisphosphate and diacylglycerol (Zhou et al., 2005 [PubMed 16107206]). | 9651 | PLCH2 | phospholipase C eta 2 | ENSG00000149527 | NA |
| The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin cause beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | 3043 | HBB | hemoglobin subunit beta | ENSG00000244734 | NA |
| NA | 9315 | NREP | neuronal regeneration related protein | ENSG00000134986 | NA |
| Guanine nucleotide dissociation stimulators (GDSs, or exchange factors), such as RALGDS, are effectors of Ras-related GTPases (see MIM 190020) that participate in signaling for a variety of cellular processes. | 5900 | RALGDS | ral guanine nucleotide dissociation stimulator | ENSG00000160271 | NA |
| Integrins are heterodimeric proteins made up of alpha and beta subunits. At least 18 alpha and 8 beta subunits have been described in mammals. Integrin family members are membrane receptors involved in cell adhesion and recognition in a variety of processes including embryogenesis, hemostasis, tissue repair, immune response and metastatic diffusion of tumor cells. This gene encodes a beta subunit. Multiple alternatively spliced transcript variants which encode different protein isoforms have been found for this gene. | 3688 | ITGB1 | integrin subunit beta 1 | ENSG00000150093 | NA |
| Synaptopodin is an actin-associated protein that may play a role in actin-based cell shape and motility. The name synaptopodin derives from the protein’s associations with postsynaptic densities and dendritic spines and with renal podocytes (Mundel et al., 1997 [PubMed 9314539]). | 11346 | SYNPO | synaptopodin | ENSG00000171992 | NA |
| This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5054 | SERPINE1 | serpin family E member 1 | ENSG00000106366 | NA |
| NA | 1153 | CIRBP | cold inducible RNA binding protein | ENSG00000099622 | NA |
| The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 1 subunit. Multiple transcript variants encoding different isoforms have been found for this gene. | 476 | ATP1A1 | ATPase Na+/K+ transporting subunit alpha 1 | ENSG00000163399 | NA |
| This gene encodes a member of the semicarbazide-sensitive amine oxidase family. Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes in the presence of copper and quinone cofactor. The encoded protein is localized to the cell surface, has adhesive properties as well as monoamine oxidase activity, and may be involved in leukocyte trafficking. Alterations in levels of the encoded protein may be associated with many diseases, including diabetes mellitus. A pseudogene of this gene has been described and is located approximately 9-kb downstream on the same chromosome. Alternative splicing results in multiple transcript variants. | 8639 | AOC3 | amine oxidase, copper containing 3 | ENSG00000131471 | NA |
| The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. This encoded protein is a cis-trans prolyl isomerase that binds to the immunosuppressants FK506 and rapamycin. It is thought to mediate calcineurin inhibition. It also interacts functionally with mature hetero-oligomeric progesterone receptor complexes along with the 90 kDa heat shock protein and P23 protein. This gene has been found to have multiple polyadenylation sites. Alternative splicing results in multiple transcript variants. | 2289 | FKBP5 | FK506 binding protein 5 | ENSG00000096060 | NA |
| NA | 7074 | TIAM1 | T-cell lymphoma invasion and metastasis 1 | ENSG00000156299 | NA |
| This gene encodes a member of the hook-related protein family. Members of this family are characterized by an N-terminal potential microtubule binding domain, a central coiled-coil and a C-terminal Hook-related domain. The encoded protein may be involved in linking organelles to microtubules. | 283234 | CCDC88B | coiled-coil domain containing 88B | ENSG00000168071 | NA |
| NA | 57185 | NIPAL3 | NIPA like domain containing 3 | ENSG00000001461 | NA |
| This gene encodes a member of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motif) protein family. Members of the family share several distinct protein modules, including a propeptide region, a metalloproteinase domain, a disintegrin-like domain, and a thrombospondin type 1 (TS) motif. Individual members of this family differ in the number of C-terminal TS motifs, and some have unique C-terminal domains. The protein encoded by this gene contains two disintegrin loops and three C-terminal TS motifs and has anti-angiogenic activity. The expression of this gene may be associated with various inflammatory processes as well as development of cancer cachexia. This gene is likely to be necessary for normal growth, fertility, and organ morphology and function. | 9510 | ADAMTS1 | ADAM metallopeptidase with thrombospondin type 1 motif 1 | ENSG00000154734 | NA |
| Members of the B class of plexins, such as PLXNB2 are transmembrane receptors that participate in axon guidance and cell migration in response to semaphorins (Perrot et al. (2002) [PubMed 12183458]). | 23654 | PLXNB2 | plexin B2 | ENSG00000196576 | NA |
| This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | 2878 | GPX3 | glutathione peroxidase 3 | ENSG00000211445 | NA |
| NA | 283450 | HECTD4 | HECT domain E3 ubiquitin protein ligase 4 | ENSG00000173064 | NA |
| This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | 5166 | PDK4 | pyruvate dehydrogenase kinase 4 | ENSG00000004799 | NA |
| This gene encodes the alpha chain of type VII collagen. The type VII collagen fibril, composed of three identical alpha collagen chains, is restricted to the basement zone beneath stratified squamous epithelia. It functions as an anchoring fibril between the external epithelia and the underlying stroma. Mutations in this gene are associated with all forms of dystrophic epidermolysis bullosa. In the absence of mutations, however, an acquired form of this disease can result from an autoimmune response made to type VII collagen. | 1294 | COL7A1 | collagen type VII alpha 1 | ENSG00000114270 | NA |
| This gene encodes a transmembrane protein containing a proline-rich domain in its N-terminal half. Studies in mice suggest that it is predominantly expressed in brain and spinal cord in embryonic and postnatal stages. Mutations in this gene are associated with episodic kinesigenic dyskinesia-1. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 112476 | PRRT2 | proline rich transmembrane protein 2 | ENSG00000167371 | NA |
| NA | 266727 | MDGA1 | MAM domain containing glycosylphosphatidylinositol anchor 1 | ENSG00000112139 | NA |
| The protein encoded by this gene is a mitogen that is secreted by vascular endothelial cells. The encoded protein plays a role in chondrocyte proliferation and differentiation, cell adhesion in many cell types, and is related to platelet-derived growth factor. Certain polymorphisms in this gene have been linked with a higher incidence of systemic sclerosis. | 1490 | CTGF | connective tissue growth factor | ENSG00000118523 | NA |
| This gene encodes an extracellular matrix protein with a spatially and temporally restricted tissue distribution. This protein is homohexameric with disulfide-linked subunits, and contains multiple EGF-like and fibronectin type-III domains. It is implicated in guidance of migrating neurons as well as axons during development, synaptic plasticity, and neuronal regeneration. | 3371 | TNC | tenascin C | ENSG00000041982 | NA |
| This gene encodes a guanine nucleotide exchange factor that interacts specifically with the GTP-bound Rac1 and plays a role in the Rho/Rac signaling pathways. A variant in this gene was associated with osteoarthritis. Alternative splicing results in multiple transcript variants. | 23263 | MCF2L | MCF.2 cell line derived transforming sequence like | ENSG00000126217 | NA |
| This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | 800 | CALD1 | caldesmon 1 | ENSG00000122786 | NA |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region on chromosome 17q21.2. | 3866 | KRT15 | keratin 15 | ENSG00000171346 | NA |
| This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | 7077 | TIMP2 | TIMP metallopeptidase inhibitor 2 | ENSG00000035862 | NA |
| NA | 4162 | MCAM | melanoma cell adhesion molecule | ENSG00000076706 | NA |
| This gene encodes a protein that contains several helicase family domains. Mutations in this gene have been found in some patients with the CHARGE syndrome. Two transcript variants encoding different isoforms have been found for this gene. | 55636 | CHD7 | chromodomain helicase DNA binding protein 7 | ENSG00000171316 | NA |
| This gene encodes a protein that activates the nuclear factor kappa B (NFKB1) signaling pathway. Mutations in this gene are associated with autosomal recessive distal spinal muscular atrophy. Multiple transcript variants encoding different isoforms have been found for this gene. | 57449 | PLEKHG5 | pleckstrin homology and RhoGEF domain containing G5 | ENSG00000171680 | NA |
| NA | 130733 | TMEM178A | transmembrane protein 178A | ENSG00000152154 | NA |
| The protein encoded by this gene shares similarity with the product of Drosophila syd gene, required for the functional interaction of kinesin I with axonal cargo. Studies of the similar gene in mouse suggested that this protein may interact with, and regulate the activity of numerous protein kinases of the JNK signaling pathway, and thus function as a scaffold protein in neuronal cells. The C. elegans counterpart of this gene is found to regulate synaptic vesicle transport possibly by integrating JNK signaling and kinesin-1 transport. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | 23162 | MAPK8IP3 | mitogen-activated protein kinase 8 interacting protein 3 | ENSG00000138834 | NA |
| The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | 123 | PLIN2 | perilipin 2 | ENSG00000147872 | NA |
| The protein encoded by this gene is a secreted, extracellular matrix protein containing an Arg-Gly-Asp (RGD) motif and calcium-binding EGF-like domains. It promotes adhesion of endothelial cells through interaction of integrins and the RGD motif. It is prominently expressed in developing arteries but less so in adult vessels. However, its expression is reinduced in balloon-injured vessels and atherosclerotic lesions, notably in intimal vascular smooth muscle cells and endothelial cells. Therefore, the protein encoded by this gene may play a role in vascular development and remodeling. Defects in this gene are a cause of autosomal dominant cutis laxa, autosomal recessive cutis laxa type I (CL type I), and age-related macular degeneration type 3 (ARMD3). | 10516 | FBLN5 | fibulin 5 | ENSG00000140092 | NA |
| Vinculin is a cytoskeletal protein associated with cell-cell and cell-matrix junctions, where it is thought to function as one of several interacting proteins involved in anchoring F-actin to the membrane. Defects in VCL are the cause of cardiomyopathy dilated type 1W. Dilated cardiomyopathy is a disorder characterized by ventricular dilation and impaired systolic function, resulting in congestive heart failure and arrhythmia. Multiple alternatively spliced transcript variants have been found for this gene, but the biological validity of some variants has not been determined. | 7414 | VCL | vinculin | ENSG00000035403 | NA |
| This locus encodes a heat shock protein. The encoded protein likely plays a role in smooth muscle relaxation. | 126393 | HSPB6 | heat shock protein family B (small) member 6 | ENSG00000004776 | NA |
| This gene encodes a syntaxin-binding protein. The encoded protein appears to play a role in release of neurotransmitters via regulation of syntaxin, a transmembrane attachment protein receptor. Mutations in this gene have been associated with infantile epileptic encephalopathy-4. Alternatively spliced transcript variants have been described. | 6812 | STXBP1 | syntaxin binding protein 1 | ENSG00000136854 | NA |
| Members of the perilipin family, such as PLIN4, coat intracellular lipid storage droplets (Wolins et al., 2003 [PubMed 12840023]). | 729359 | PLIN4 | perilipin 4 | ENSG00000167676 | NA |
| This gene encodes a C2H2-type zinc finger protein which acts as a transcriptional repressor of genes involved in neuronal development. The encoded protein recognizes a specific sequence motif and recruits components of chromatin to target genes. Alternative splicing results in multiple transcript variants. | 10472 | ZBTB18 | zinc finger and BTB domain containing 18 | ENSG00000179456 | NA |
| This gene encodes a protein that plays a role in desmosome assembly, cell adhesion, cytoskeletal organization, and epidermal differentiation. This protein co-localizes with desmoplakin and the cytolinker protein periplakin. In general, this protein localizes to the nucleus, desmosomes, cell membrane, and cortical actin-based structures. Some isoforms of this protein also associate with microtubules. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Additional splice variants have been described but their biological validity has not been verified. | 23254 | KAZN | kazrin, periplakin interacting protein | ENSG00000189337 | NA |
| NA | 100507347 | VIM-AS1 | VIM antisense RNA 1 | ENSG00000229124 | NA |
| NA | 25989 | ULK3 | unc-51 like kinase 3 | ENSG00000140474 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 17, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[18,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | summary | name | symbol | query | notfound |
|---|---|---|---|---|---|
| 2512 | This gene encodes the light subunit of the ferritin protein. Ferritin is the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in this light chain ferritin gene are associated with several neurodegenerative diseases and hyperferritinemia-cataract syndrome. This gene has multiple pseudogenes. | ferritin, light polypeptide | FTL | ENSG00000087086 | NA |
| 3860 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | keratin 13 | KRT13 | ENSG00000171401 | NA |
| 8490 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | regulator of G-protein signaling 5 | RGS5 | ENSG00000143248 | NA |
| 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | glutathione peroxidase 3 | GPX3 | ENSG00000211445 | NA |
| NA | NA | NA | NA | ENSG00000117289 | TRUE |
| 2495 | This gene encodes the heavy subunit of ferritin, the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in ferritin proteins are associated with several neurodegenerative diseases. This gene has multiple pseudogenes. Several alternatively spliced transcript variants have been observed, but their biological validity has not been determined. | ferritin heavy chain 1 | FTH1 | ENSG00000167996 | NA |
| 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | actin, alpha 2, smooth muscle, aorta | ACTA2 | ENSG00000107796 | NA |
| 3851 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 4 | KRT4 | ENSG00000170477 | NA |
| 5310 | This gene encodes a member of the polycystin protein family. The encoded glycoprotein contains a large N-terminal extracellular region, multiple transmembrane domains and a cytoplasmic C-tail. It is an integral membrane protein that functions as a regulator of calcium permeable cation channels and intracellular calcium homoeostasis. It is also involved in cell-cell/matrix interactions and may modulate G-protein-coupled signal-transduction pathways. It plays a role in renal tubular development, and mutations in this gene cause autosomal dominant polycystic kidney disease type 1 (ADPKD1). ADPKD1 is characterized by the growth of fluid-filled cysts that replace normal renal tissue and result in end-stage renal failure. Splice variants encoding different isoforms have been noted for this gene. Also, six pseudogenes, closely linked in a known duplicated region on chromosome 16p, have been described. | polycystin 1, transient receptor potential channel interacting | PKD1 | ENSG00000008710 | NA |
| 567 | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | beta-2-microglobulin | B2M | ENSG00000166710 | NA |
| 7169 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | tropomyosin 2 (beta) | TPM2 | ENSG00000198467 | NA |
| 4162 | NA | melanoma cell adhesion molecule | MCAM | ENSG00000076706 | NA |
| 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | keratin 10 | KRT10 | ENSG00000186395 | NA |
| 6707 | NA | small proline rich protein 3 | SPRR3 | ENSG00000163209 | NA |
| 8516 | Integrins are heterodimeric transmembrane receptor proteins that mediate numerous cellular processes including cell adhesion, cytoskeletal rearrangement, and activation of cell signaling pathways. Integrins are composed of alpha and beta subunits. This gene encodes the alpha 8 subunit of the heterodimeric integrin alpha8beta1 protein. The encoded protein is a single-pass type 1 membrane protein that contains multiple FG-GAP repeats. This repeat is predicted to fold into a beta propeller structure. This gene regulates the recruitment of mesenchymal cells into epithelial structures, mediates cell-cell interactions, and regulates neurite outgrowth of sensory and motor neurons. The integrin alpha8beta1 protein thus plays an important role in wound-healing and organogenesis. Mutations in this gene have been associated with renal hypodysplasia/aplasia-1 (RHDA1) and with several animal models of chronic kidney disease. Alternate splicing results in multiple transcript variants encoding distinct isoforms. | integrin subunit alpha 8 | ITGA8 | ENSG00000077943 | NA |
| 2006 | This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | elastin | ELN | ENSG00000049540 | NA |
| 4854 | This gene encodes the third discovered human homologue of the Drosophila melanogaster type I membrane protein notch. In Drosophila, notch interaction with its cell-bound ligands (delta, serrate) establishes an intercellular signalling pathway that plays a key role in neural development. Homologues of the notch-ligands have also been identified in human, but precise interactions between these ligands and the human notch homologues remain to be determined. Mutations in NOTCH3 have been identified as the underlying cause of cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL). | notch 3 | NOTCH3 | ENSG00000074181 | NA |
| 3912 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 1. The beta 1 chain has 7 structurally distinct domains which it shares with other beta chain isomers. The C-terminal helical region containing domains I and II are separated by domain alpha, domains III and V contain several EGF-like repeats, and domains IV and VI have a globular conformation. Laminin, beta 1 is expressed in most tissues that produce basement membranes, and is one of the 3 chains constituting laminin 1, the first laminin isolated from Engelbreth-Holm-Swarm (EHS) tumor. A sequence in the beta 1 chain that is involved in cell attachment, chemotaxis, and binding to the laminin receptor was identified and shown to have the capacity to inhibit metastasis. | laminin subunit beta 1 | LAMB1 | ENSG00000091136 | NA |
| 7057 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | thrombospondin 1 | THBS1 | ENSG00000137801 | NA |
| 3043 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin cause beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | hemoglobin subunit beta | HBB | ENSG00000244734 | NA |
| 3133 | HLA-E belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. HLA-E binds a restricted subset of peptides derived from the leader peptides of other class I molecules. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domains, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. | major histocompatibility complex, class I, E | HLA-E | ENSG00000204592 | NA |
| ENSG00000229732 | NA | NA | AC019349.5 | ENSG00000229732 | NA |
| 55160 | ARHGEF10L is a member of the RhoGEF family of guanine nucleotide exchange factors (GEFs) that activate Rho GTPases (Winkler et al., 2005 [PubMed 16112081]). | Rho guanine nucleotide exchange factor 10 like | ARHGEF10L | ENSG00000074964 | NA |
| 6711 | Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein contains an N-terminal actin-binding domain, and 17 spectrin repeats which are involved in dimer formation. Multiple transcript variants encoding different isoforms have been found for this gene. | spectrin beta, non-erythrocytic 1 | SPTBN1 | ENSG00000115306 | NA |
| 23770 | The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. Unlike the other members of the family, this encoded protein does not seem to have PPIase/rotamase activity. It may have a role in neurons associated with memory function. | FK506 binding protein 8 | FKBP8 | ENSG00000105701 | NA |
| 4856 | The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | nephroblastoma overexpressed | NOV | ENSG00000136999 | NA |
| 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | MYH11 | ENSG00000133392 | NA |
| ENSG00000180139 | NA | ACTA2 antisense RNA 1 | ACTA2-AS1 | ENSG00000180139 | NA |
| 4069 | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | lysozyme | LYZ | ENSG00000090382 | NA |
| 6556 | This gene is a member of the solute carrier family 11 (proton-coupled divalent metal ion transporters) family and encodes a multi-pass membrane protein. The protein functions as a divalent transition metal (iron and manganese) transporter involved in iron metabolism and host resistance to certain pathogens. Mutations in this gene have been associated with susceptibility to infectious diseases such as tuberculosis and leprosy, and inflammatory diseases such as rheumatoid arthritis and Crohn disease. Alternatively spliced variants that encode different protein isoforms have been described but the full-length nature of only one has been determined. | solute carrier family 11 member 1 | SLC11A1 | ENSG00000018280 | NA |
| 1634 | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | decorin | DCN | ENSG00000011465 | NA |
| 1293 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | collagen type VI alpha 3 chain | COL6A3 | ENSG00000163359 | NA |
| 85301 | This gene encodes a member of the fibrillar collagen family, and plays a role during the calcification of cartilage and the transition of cartilage to bone. The encoded protein product is a preproprotein. It includes an N-terminal signal peptide, which is followed by an N-terminal propeptide, mature peptide and a C-terminal propeptide. The N-terminal propeptide contains thrombospondin N-terminal-like and laminin G-like domains. The mature peptide is a major triple-helical region. The C-terminal propeptide, also known as COLFI domain, plays crucial roles in tissue growth and repair. Mutations in this gene cause Steel syndrome. Alternatively spliced transcript variants have been found, but the full-length nature of some variants has not been determined. | collagen type XXVII alpha 1 | COL27A1 | ENSG00000196739 | NA |
| 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | fatty acid synthase | FASN | ENSG00000169710 | NA |
| 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | actin, beta | ACTB | ENSG00000075624 | NA |
| 5730 | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | prostaglandin D2 synthase | PTGDS | ENSG00000107317 | NA |
| 1476 | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins l, h and b. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). | cystatin B | CSTB | ENSG00000160213 | NA |
| 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | actin, gamma 2, smooth muscle, enteric | ACTG2 | ENSG00000163017 | NA |
| 100129518 | NA | uncharacterized LOC100129518 | LOC100129518 | ENSG00000112096 | NA |
| 6648 | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | superoxide dismutase 2, mitochondrial | SOD2 | ENSG00000112096 | NA |
| 32 | Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | acetyl-CoA carboxylase beta | ACACB | ENSG00000076555 | NA |
| 80781 | This gene encodes the alpha chain of type XVIII collagen. This collagen is one of the multiplexins, extracellular matrix proteins that contain multiple triple-helix domains (collagenous domains) interrupted by non-collagenous domains. A long isoform of the protein has an N-terminal domain that is homologous to the extracellular part of frizzled receptors. Proteolytic processing at several endogenous cleavage sites in the C-terminal domain results in production of endostatin, a potent antiangiogenic protein that is able to inhibit angiogenesis and tumor growth. Mutations in this gene are associated with Knobloch syndrome. The main features of this syndrome involve retinal abnormalities, so type XVIII collagen may play an important role in retinal structure and in neural tube closure. Alternative splicing results in multiple transcript variants. | collagen type XVIII alpha 1 chain | COL18A1 | ENSG00000182871 | NA |
| 2180 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | acyl-CoA synthetase long-chain family member 1 | ACSL1 | ENSG00000151726 | NA |
| 7078 | This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix (ECM). Expression of this gene is induced in response to mitogenic stimulation and this netrin domain-containing protein is localized to the ECM. Mutations in this gene have been associated with the autosomal dominant disorder Sorsby’s fundus dystrophy. | TIMP metallopeptidase inhibitor 3 | TIMP3 | ENSG00000100234 | NA |
| 4638 | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | myosin light chain kinase | MYLK | ENSG00000065534 | NA |
| 8497 | PPFIA4, or liprin-alpha-4, belongs to the liprin-alpha gene family. See liprin-alpha-1 (LIP1, or PPFIA1; MIM 611054) for background on liprins. | PTPRF interacting protein alpha 4 | PPFIA4 | ENSG00000143847 | NA |
| 4256 | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | matrix Gla protein | MGP | ENSG00000111341 | NA |
| 4642 | NA | myosin ID | MYO1D | ENSG00000176658 | NA |
| 2876 | This gene encodes a member of the glutathione peroxidase family. Glutathione peroxidase functions in the detoxification of hydrogen peroxide, and is one of the most important antioxidant enzymes in humans. This protein is one of only a few proteins known in higher vertebrates to contain selenocysteine, which occurs at the active site of glutathione peroxidase and is coded by UGA, that normally functions as a translation termination codon. In addition, this protein is characterized by a polyalanine sequence polymorphism in the N-terminal region, which includes three alleles with five, six or seven alanine (ALA) repeats in this sequence. The allele with five ALA repeats is significantly associated with breast cancer risk. Two alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | glutathione peroxidase 1 | GPX1 | ENSG00000233276 | NA |
| 4946 | The protein encoded by this gene belongs to the ornithine decarboxylase antizyme family, which plays a role in cell growth and proliferation by regulating intracellular polyamine levels. Expression of antizymes requires +1 ribosomal frameshifting, which is enhanced by high levels of polyamines. Antizymes in turn bind to and inhibit ornithine decarboxylase (ODC), the key enzyme in polyamine biosynthesis; thus, completing the auto-regulatory circuit. This gene encodes antizyme 1, the first member of the antizyme family, that has broad tissue distribution, and negatively regulates intracellular polyamine levels by binding to and targeting ODC for degradation, as well as inhibiting polyamine uptake. Antizyme 1 mRNA contains two potential in-frame AUGs; and studies in rat suggest that alternative use of the two translation initiation sites results in N-terminally distinct protein isoforms with different subcellular localization. Alternatively spliced transcript variants have also been noted for this gene. | ornithine decarboxylase antizyme 1 | OAZ1 | ENSG00000104904 | NA |
| 3315 | The protein encoded by this gene is induced by environmental stress and developmental changes. The encoded protein is involved in stress resistance and actin organization and translocates from the cytoplasm to the nucleus upon stress induction. Defects in this gene are a cause of Charcot-Marie-Tooth disease type 2F (CMT2F) and distal hereditary motor neuropathy (dHMN). | heat shock protein family B (small) member 1 | HSPB1 | ENSG00000106211 | NA |
| 9645 | NA | microtubule associated monooxygenase, calponin and LIM domain containing 2 | MICAL2 | ENSG00000133816 | NA |
| 290 | Aminopeptidase N is located in the small-intestinal and renal microvillar membrane, and also in other plasma membranes. In the small intestine aminopeptidase N plays a role in the final digestion of peptides generated from hydrolysis of proteins by gastric and pancreatic proteases. Its function in proximal tubular epithelial cells and other cell types is less clear. The large extracellular carboxyterminal domain contains a pentapeptide consensus sequence characteristic of members of the zinc-binding metalloproteinase superfamily. Sequence comparisons with known enzymes of this class showed that CD13 and aminopeptidase N are identical. The latter enzyme was thought to be involved in the metabolism of regulatory peptides by diverse cell types, including small intestinal and renal tubular epithelial cells, macrophages, granulocytes, and synaptic membranes from the CNS. Human aminopeptidase N is a receptor for one strain of human coronavirus that is an important cause of upper respiratory tract infections. Defects in this gene appear to be a cause of various types of leukemia or lymphoma. | alanyl aminopeptidase, membrane | ANPEP | ENSG00000166825 | NA |
| 266727 | NA | MAM domain containing glycosylphosphatidylinositol anchor 1 | MDGA1 | ENSG00000112139 | NA |
| 1893 | This gene encodes a soluble protein that is involved in endochondral bone formation, angiogenesis, and tumor biology. It also interacts with a variety of extracellular and structural proteins, contributing to the maintenance of skin integrity and homeostasis. Mutations in this gene are associated with lipoid proteinosis disorder (also known as hyalinosis cutis et mucosae or Urbach-Wiethe disease) that is characterized by generalized thickening of skin, mucosae and certain viscera. Alternatively spliced transcript variants encoding distinct isoforms have been described for this gene. | extracellular matrix protein 1 | ECM1 | ENSG00000143369 | NA |
| 182 | The jagged 1 protein encoded by JAG1 is the human homolog of the Drosophila jagged protein. Human jagged 1 is the ligand for the receptor notch 1, the latter a human homolog of the Drosophila jagged receptor notch. Mutations that alter the jagged 1 protein cause Alagille syndrome. Jagged 1 signalling through notch 1 has also been shown to play a role in hematopoiesis. | jagged 1 | JAG1 | ENSG00000101384 | NA |
| 140576 | NA | S100 calcium binding protein A16 | S100A16 | ENSG00000188643 | NA |
| 716 | This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | complement component 1, s subcomponent | C1S | ENSG00000182326 | NA |
| 165 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | AE binding protein 1 | AEBP1 | ENSG00000106624 | NA |
| 6319 | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | stearoyl-CoA desaturase | SCD | ENSG00000099194 | NA |
| 2487 | The protein encoded by this gene is a secreted protein that is involved in the regulation of bone development. Defects in this gene are a cause of female-specific osteoarthritis (OA) susceptibility. | frizzled-related protein | FRZB | ENSG00000162998 | NA |
| 7074 | NA | T-cell lymphoma invasion and metastasis 1 | TIAM1 | ENSG00000156299 | NA |
| 4131 | This gene encodes a protein that belongs to the microtubule-associated protein family. The proteins of this family are thought to be involved in microtubule assembly, which is an essential step in neurogenesis. The product of this gene is a precursor polypeptide that presumably undergoes proteolytic processing to generate the final MAP1B heavy chain and LC1 light chain. Gene knockout studies of the mouse microtubule-associated protein 1B gene suggested an important role in development and function of the nervous system. | microtubule associated protein 1B | MAP1B | ENSG00000131711 | NA |
| 81 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, alpha actinin isoform which is concentrated in the cytoplasm, and thought to be involved in metastatic processes. Mutations in this gene have been associated with focal and segmental glomerulosclerosis. | actinin alpha 4 | ACTN4 | ENSG00000130402 | NA |
| 308 | The protein encoded by this gene belongs to the annexin family of calcium-dependent phospholipid binding proteins some of which have been implicated in membrane-related events along exocytotic and endocytotic pathways. Annexin 5 is a phospholipase A2 and protein kinase C inhibitory protein with calcium channel activity and a potential role in cellular signal transduction, inflammation, growth and differentiation. Annexin 5 has also been described as placental anticoagulant protein I, vascular anticoagulant-alpha, endonexin II, lipocortin V, placental protein 4 and anchorin CII. The gene spans 29 kb containing 13 exons, and encodes a single transcript of approximately 1.6 kb and a protein product with a molecular weight of about 35 kDa. | annexin A5 | ANXA5 | ENSG00000164111 | NA |
| 49860 | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | cornulin | CRNN | ENSG00000143536 | NA |
| 6035 | This gene encodes a member of the pancreatic-type of secretory ribonucleases, a subset of the ribonuclease A superfamily. The encoded endonuclease cleaves internal phosphodiester RNA bonds on the 3’-side of pyrimidine bases. It prefers poly(C) as a substrate and hydrolyzes 2’,3’-cyclic nucleotides, with a pH optimum near 8.0. The encoded protein is monomeric and more commonly acts to degrade ds-RNA over ss-RNA. Alternative splicing occurs at this locus and four transcript variants encoding the same protein have been identified. | ribonuclease A family member 1, pancreatic | RNASE1 | ENSG00000129538 | NA |
| 928 | This gene encodes a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Tetraspanins are cell surface glycoproteins with four transmembrane domains that form multimeric complexes with other cell surface proteins. The encoded protein functions in many cellular processes including differentiation, adhesion, and signal transduction, and expression of this gene plays a critical role in the suppression of cancer cell motility and metastasis. | CD9 molecule | CD9 | ENSG00000010278 | NA |
| 1410 | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | crystallin alpha B | CRYAB | ENSG00000109846 | NA |
| 4023 | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | lipoprotein lipase | LPL | ENSG00000175445 | NA |
| 140710 | NA | suppressor of glucose, autophagy associated 1 | SOGA1 | ENSG00000149639 | NA |
| 5166 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | pyruvate dehydrogenase kinase 4 | PDK4 | ENSG00000004799 | NA |
| 4628 | This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | myosin, heavy chain 10, non-muscle | MYH10 | ENSG00000133026 | NA |
| 2670 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | glial fibrillary acidic protein | GFAP | ENSG00000131095 | NA |
| 2995 | Glycophorin C (GYPC) is an integral membrane glycoprotein. It is a minor species carried by human erythrocytes, but plays an important role in regulating the mechanical stability of red cells. A number of glycophorin C mutations have been described. The Gerbich and Yus phenotypes are due to deletion of exon 3 and 2, respectively. The Webb and Duch antigens, also known as glycophorin D, result from single point mutations of the glycophorin C gene. The glycophorin C protein has very little homology with glycophorins A and B. Alternate splicing results in multiple transcript variants. | glycophorin C (Gerbich blood group) | GYPC | ENSG00000136732 | NA |
| 1675 | This gene encodes a member of the S1, or chymotrypsin, family of serine peptidases. This protease catalyzes the cleavage of factor B, the rate-limiting step of the alternative pathway of complement activation. This protein also functions as an adipokine, a cell signaling protein secreted by adipocytes, which regulates insulin secretion in mice. Mutations in this gene underlie complement factor D deficiency, which is associated with recurrent bacterial meningitis infections in human patients. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate the mature protease. | complement factor D | CFD | ENSG00000197766 | NA |
| 3487 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. | insulin like growth factor binding protein 4 | IGFBP4 | ENSG00000141753 | NA |
| 715 | NA | complement C1r subcomponent | C1R | ENSG00000159403 | NA |
| 752 | This gene encodes a formin-related protein. Formin-related proteins have been implicated in morphogenesis, cytokinesis, and cell polarity. An alternative splice variant has been described but its full length sequence has not been determined. | formin like 1 | FMNL1 | ENSG00000184922 | NA |
| 10398 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | myosin light chain 9 | MYL9 | ENSG00000101335 | NA |
| 5818 | This gene encodes an adhesion protein that plays a role in the organization of adherens junctions and tight junctions in epithelial and endothelial cells. The protein is a calcium(2+)-independent cell-cell adhesion molecule that belongs to the immunoglobulin superfamily and has 3 extracellular immunoglobulin-like loops, a single transmembrane domain (in some isoforms), and a cytoplasmic region. This protein acts as a receptor for glycoprotein D (gD) of herpes simplex viruses 1 and 2 (HSV-1, HSV-2), and pseudorabies virus (PRV) and mediates viral entry into epithelial and neuronal cells. Mutations in this gene cause cleft lip and palate/ectodermal dysplasia 1 syndrome (CLPED1) as well as non-syndromic cleft lip with or without cleft palate (CL/P). Alternative splicing results in multiple transcript variants encoding proteins with distinct C-termini. | nectin cell adhesion molecule 1 | NECTIN1 | ENSG00000110400 | NA |
| 22898 | NA | DENN domain containing 3 | DENND3 | ENSG00000105339 | NA |
| 5420 | This gene encodes a member of the sialomucin protein family. The encoded protein was originally identified as an important component of glomerular podocytes. Podocytes are highly differentiated epithelial cells with interdigitating foot processes covering the outer aspect of the glomerular basement membrane. Other biological activities of the encoded protein include: binding in a membrane protein complex with Na+/H+ exchanger regulatory factor to intracellular cytoskeletal elements, playing a role in hematopoietic cell differentiation, and being expressed in vascular endothelium cells and binding to L-selectin. | podocalyxin like | PODXL | ENSG00000128567 | NA |
| 3911 | This gene encodes one of the vertebrate laminin alpha chains. Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. The protein encoded by this gene is the alpha-5 subunit of laminin-10 (laminin-511), laminin-11 (laminin-521) and laminin-15 (laminin-523). | laminin subunit alpha 5 | LAMA5 | ENSG00000130702 | NA |
| NA | NA | NA | NA | ENSG00000259716 | TRUE |
| 3861 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | keratin 14 | KRT14 | ENSG00000186847 | NA |
| 58 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | actin, alpha 1, skeletal muscle | ACTA1 | ENSG00000143632 | NA |
| 1535 | Cytochrome b is comprised of a light chain (alpha) and a heavy chain (beta). This gene encodes the light, alpha subunit which has been proposed as a primary component of the microbicidal oxidase system of phagocytes. Mutations in this gene are associated with autosomal recessive chronic granulomatous disease (CGD), that is characterized by the failure of activated phagocytes to generate superoxide, which is important for the microbicidal activity of these cells. | cytochrome b-245 alpha chain | CYBA | ENSG00000051523 | NA |
| 64855 | NA | family with sequence similarity 129 member B | FAM129B | ENSG00000136830 | NA |
| 125 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | alcohol dehydrogenase 1B (class I), beta polypeptide | ADH1B | ENSG00000196616 | NA |
| 158471 | The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. | prune homolog 2 | PRUNE2 | ENSG00000106772 | NA |
| 4878 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | natriuretic peptide A | NPPA | ENSG00000175206 | NA |
| 6497 | This gene encodes the nuclear protooncogene protein homolog of avian sarcoma viral (v-ski) oncogene. It functions as a repressor of TGF-beta signaling, and may play a role in neural tube development and muscle differentiation. | SKI proto-oncogene | SKI | ENSG00000157933 | NA |
| 54751 | This gene encodes a protein with an N-terminal filamin-binding domain, a central proline-rich domain, and multiple C-terminal LIM domains. This protein localizes at cell junctions and may link cell adhesion structures to the actin cytoskeleton. This protein may be involved in the assembly and stabilization of actin-filaments and likely plays a role in modulating cell adhesion, cell morphology and cell motility. This protein also localizes to the nucleus and may affect cardiomyocyte differentiation after binding with the CSX/NKX2-5 transcription factor. Alternative splicing results in multiple transcript variants encoding different isoforms. | filamin binding LIM protein 1 | FBLIM1 | ENSG00000162458 | NA |
| 4060 | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin. In these bifunctional molecules, the protein moiety binds collagen fibrils and the highly charged hydrophilic glycosaminoglycans regulate interfibrillar spacings. Lumican is the major keratan sulfate proteoglycan of the cornea but is also distributed in interstitial collagenous matrices throughout the body. Lumican may regulate collagen fibril organization and circumferential growth, corneal transparency, and epithelial cell migration and tissue repair. | lumican | LUM | ENSG00000139329 | NA |
| 27295 | The protein encoded by this gene contains a PDZ domain and a LIM domain, indicating that it may be involved in cytoskeletal assembly. In support of this, the encoded protein has been shown to bind the spectrin-like repeats of alpha-actinin-2 and to colocalize with alpha-actinin-2 at the Z lines of skeletal muscle. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. Aberrant alternative splicing of this gene may play a role in myotonic dystrophy. | PDZ and LIM domain 3 | PDLIM3 | ENSG00000154553 | NA |
| 27245 | This gene encodes a protein containing two AT-hooks, which likely function in DNA binding. Mutations in this gene were found in individuals with Xia-Gibbs syndrome. | AT-hook DNA binding motif containing 1 | AHDC1 | ENSG00000126705 | NA |
| 80709 | NA | AT-hook transcription factor | AKNA | ENSG00000106948 | NA |
| 8519 | NA | interferon induced transmembrane protein 1 | IFITM1 | ENSG00000185885 | NA |
| 23129 | NA | plexin D1 | PLXND1 | ENSG00000004399 | NA |
| 718 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | complement component 3 | C3 | ENSG00000125730 | NA |
# Save the Ensembl IDs queried for cluster 18, one per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_", 18, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
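The annotation tables in this document contain some queries with no match (`notfound` is `TRUE`) and some queries that return more than one hit, so `out$query` can hold unmatched or repeated IDs. The following is a minimal sketch, not part of the original analysis, of one way to filter and de-duplicate the IDs before writing them. The data frame `ann` is a hypothetical stand-in for `as.data.frame(out)` with a few illustrative rows, and `tempfile()` replaces the real output path.

```r
# Hypothetical stand-in for as.data.frame(out); the rows are illustrative only.
ann <- data.frame(
  query    = c("ENSG00000125730", "ENSG00000259716",
               "ENSG00000237298", "ENSG00000237298"),
  symbol   = c("C3", NA, "LOC101927055", "TTN-AS1"),
  notfound = c(NA, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

# Drop unmatched queries. The notfound column is only present when at least
# one query failed, so check for it before filtering.
if ("notfound" %in% names(ann)) {
  ann <- ann[is.na(ann$notfound), , drop = FALSE]
}

# Collapse queries that returned several hits into a single ID each.
ids <- unique(ann$query)

# Write one Ensembl ID per line (temporary file used instead of the real path).
out_file <- tempfile(fileext = ".txt")
write.table(ids, out_file, col.names = FALSE, row.names = FALSE, quote = FALSE)
```

Whether unmatched IDs should be dropped depends on what the downstream tool expects; the original code writes every queried ID.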
out <- mygene::queryMany(gene_list[19,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | summary | X_id | query | symbol |
|---|---|---|---|---|
| actin, alpha 1, skeletal muscle | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | 58 | ENSG00000143632 | ACTA1 |
| creatine kinase, M-type | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | 1158 | ENSG00000104879 | CKM |
| titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | ENSG00000155657 | TTN |
| nebulin | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | 4703 | ENSG00000183091 | NEB |
| myosin binding protein C, slow type | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 4604 | ENSG00000196091 | MYBPC1 |
| myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | 4625 | ENSG00000092054 | MYH7 |
| myosin, heavy chain 1, skeletal muscle, adult | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | 4619 | ENSG00000109061 | MYH1 |
| myosin light chain 2 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | 4633 | ENSG00000111245 | MYL2 |
| myosin, heavy chain 2, skeletal muscle, adult | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 4620 | ENSG00000125414 | MYH2 |
| troponin T1, slow skeletal type | This gene encodes a protein that is a subunit of troponin, which is a regulatory complex located on the thin filament of the sarcomere. This complex regulates striated muscle contraction in response to fluctuations in intracellular calcium concentration. This complex is composed of three subunits: troponin C, which binds calcium, troponin T, which binds tropomyosin, and troponin I, which is an inhibitory subunit. This protein is the slow skeletal troponin T subunit. Mutations in this gene cause nemaline myopathy type 5, also known as Amish nemaline myopathy, a neuromuscular disorder characterized by muscle weakness and rod-shaped, or nemaline, inclusions in skeletal muscle fibers which affects infants, resulting in death due to respiratory insufficiency, usually in the second year. Multiple transcript variants encoding different isoforms have been found for this gene. | 7138 | ENSG00000105048 | TNNT1 |
| actin, beta | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 | ENSG00000075624 | ACTB |
| natriuretic peptide A | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | 4878 | ENSG00000175206 | NPPA |
| troponin C2, fast skeletal type | Troponin (Tn), a key protein complex in the regulation of striated muscle contraction, is composed of 3 subunits. The Tn-I subunit inhibits actomyosin ATPase, the Tn-T subunit binds tropomyosin and Tn-C, while the Tn-C subunit binds calcium and overcomes the inhibitory action of the troponin complex on actin filaments. The protein encoded by this gene is the Tn-C subunit. | 7125 | ENSG00000101470 | TNNC2 |
| phosphorylase, glycogen, muscle | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | 5837 | ENSG00000068976 | PYGM |
| troponin I1, slow skeletal type | Troponin proteins associate with tropomyosin and regulate the calcium sensitivity of the myofibril contractile apparatus of striated muscles. Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. The TnI-fast and TnI-slow genes are expressed in fast-twitch and slow-twitch skeletal muscle fibers, respectively, while the TnI-cardiac gene is expressed exclusively in cardiac muscle tissue. This gene encodes the Troponin-I-skeletal-slow-twitch protein. This gene is expressed in cardiac and skeletal muscle during early development but is restricted to slow-twitch skeletal muscle fibers in adults. The encoded protein prevents muscle contraction by inhibiting calcium-mediated conformational changes in actin-myosin complexes. | 7135 | ENSG00000159173 | TNNI1 |
| ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 1 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol to the sarcoplasmic reticulum lumen, and is involved in muscular excitation and contraction. Mutations in this gene cause some autosomal recessive forms of Brody disease, characterized by increasing impairment of muscular relaxation during exercise. Alternative splicing results in three transcript variants encoding different isoforms. | 487 | ENSG00000196296 | ATP2A1 |
| ryanodine receptor 1 | This gene encodes a ryanodine receptor found in skeletal muscle. The encoded protein functions as a calcium release channel in the sarcoplasmic reticulum but also serves to connect the sarcoplasmic reticulum and transverse tubule. Mutations in this gene are associated with malignant hyperthermia susceptibility, central core disease, and minicore myopathy with external ophthalmoplegia. Alternatively spliced transcripts encoding different isoforms have been described. | 6261 | ENSG00000196218 | RYR1 |
| cardiomyopathy associated 5 | NA | 202333 | ENSG00000164309 | CMYA5 |
| myosin light chain 1 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in fast skeletal muscle. Two transcript variants have been identified for this gene. | 4632 | ENSG00000168530 | MYL1 |
| hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | 3043 | ENSG00000244734 | HBB |
| carbonic anhydrase 3 | Carbonic anhydrase III (CAIII) is a member of a multigene family (at least six separate genes are known) that encodes carbonic anhydrase isozymes. These carbonic anhydrases are a class of metalloenzymes that catalyze the reversible hydration of carbon dioxide and are differentially expressed in a number of cell types. The expression of the CA3 gene is strictly tissue specific and present at high levels in skeletal muscle and much lower levels in cardiac and smooth muscle. A proportion of carriers of Duchenne muscle dystrophy have a higher CA3 level than normal. The gene spans 10.3 kb and contains seven exons and six introns. | 761 | ENSG00000164879 | CA3 |
| myoglobin | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | 4151 | ENSG00000198125 | MB |
| troponin T3, fast skeletal type | The binding of Ca(2+) to the trimeric troponin complex initiates the process of muscle contraction. Increased Ca(2+) concentrations produce a conformational change in the troponin complex that is transmitted to tropomyosin dimers situated along actin filaments. The altered conformation permits increased interaction between a myosin head and an actin filament which, ultimately, produces a muscle contraction. The troponin complex has protein subunits C, I, and T. Subunit C binds Ca(2+) and subunit I binds to actin and inhibits actin-myosin interaction. Subunit T binds the troponin complex to the tropomyosin complex and is also required for Ca(2+)-mediated activation of actomyosin ATPase activity. There are 3 different troponin T genes that encode tissue-specific isoforms of subunit T for fast skeletal-, slow skeletal-, and cardiac-muscle. This gene encodes fast skeletal troponin T protein; also known as troponin T type 3. Alternative splicing results in multiple transcript variants encoding additional distinct troponin T type 3 isoforms. A developmentally regulated switch between fetal/neonatal and adult troponin T type 3 isoforms occurs. Additional splice variants have been described but their biological validity has not been established. Mutations in this gene may cause distal arthrogryposis multiplex congenita type 2B (DA2B). | 7140 | ENSG00000130595 | TNNT3 |
| uncharacterized LOC101927055 | NA | 101927055 | ENSG00000237298 | LOC101927055 |
| TTN antisense RNA 1 | NA | 100506866 | ENSG00000237298 | TTN-AS1 |
| obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF | The obscurin gene spans more than 150 kb, contains over 80 exons and encodes a protein of approximately 720 kDa. The encoded protein contains 68 Ig domains, 2 fibronectin domains, 1 calcium/calmodulin-binding domain, 1 RhoGEF domain with an associated PH domain, and 2 serine-threonine kinase domains. This protein belongs to the family of giant sarcomeric signaling proteins that includes titin and nebulin, and may have a role in the organization of myofibrils during assembly and may mediate interactions between the sarcoplasmic reticulum and myofibrils. Alternatively spliced transcript variants encoding different isoforms have been identified. | 84033 | ENSG00000154358 | OBSCN |
| eukaryotic translation elongation factor 1 alpha 1 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart and skeletal muscle. This isoform is identified as an autoantigen in 66% of patients with Felty syndrome. This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes. | 1915 | ENSG00000156508 | EEF1A1 |
| nebulin related anchoring protein | NA | 4892 | ENSG00000197893 | NRAP |
| enolase 3 | This gene encodes one of the three enolase isoenzymes found in mammals. This isoenzyme is found in skeletal muscle cells in the adult where it may play a role in muscle development and regeneration. A switch from alpha enolase to beta enolase occurs in muscle tissue during development in rodents. Mutations in this gene have been associated with glycogen storage disease. Alternatively spliced transcript variants encoding different isoforms have been described. | 2027 | ENSG00000108515 | ENO3 |
| myosin binding protein C, fast type | This gene encodes a member of the myosin-binding protein C family. This family includes the fast-, slow- and cardiac-type isoforms, each of which is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The protein encoded by this locus is referred to as the fast-type isoform. Mutations in the related but distinct genes encoding the slow-type and cardiac-type isoforms have been associated with distal arthrogryposis, type 1 and hypertrophic cardiomyopathy, respectively. | 4606 | ENSG00000086967 | MYBPC2 |
| kelch like family member 41 | This gene is a member of the kelch-like family. The encoded protein contains a BACK domain, a BTB/POZ domain, and 5 Kelch repeats. This protein is thought to function in skeletal muscle development and maintenance. Mutations in this gene have been associated with nemaline myopathy (NM), a rare congenital muscle disorder. | 10324 | ENSG00000239474 | KLHL41 |
| protamine 2 | Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | 5620 | ENSG00000122304 | PRM2 |
| myosin, heavy chain 6, cardiac muscle, alpha | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | 4624 | ENSG00000197616 | MYH6 |
| troponin I2, fast skeletal type | This gene encodes a fast-twitch skeletal muscle protein, a member of the troponin I gene family, and a component of the troponin complex including troponin T, troponin C and troponin I subunits. The troponin complex, along with tropomyosin, is responsible for the calcium-dependent regulation of striated muscle contraction. Mouse studies show that this component is also present in vascular smooth muscle and may play a role in regulation of smooth muscle function. In addition to muscle tissues, this protein is found in corneal epithelium, cartilage where it is an inhibitor of angiogenesis to inhibit tumor growth and metastasis, and mammary gland where it functions as a co-activator of estrogen receptor-related receptor alpha. This protein also suppresses tumor growth in human ovarian carcinoma. Mutations in this gene cause myopathy and distal arthrogryposis type 2B. Alternatively spliced transcript variants have been found for this gene. | 7136 | ENSG00000130598 | TNNI2 |
| myosin light chain, phosphorylatable, fast skeletal muscle | NA | 29895 | ENSG00000180209 | MYLPF |
| NPPA antisense RNA 1 | NA | ENSG00000242349 | ENSG00000242349 | NPPA-AS1 |
| poly(A) binding protein cytoplasmic 1 | This gene encodes a poly(A) binding protein. The protein shuttles between the nucleus and cytoplasm and binds to the 3’ poly(A) tail of eukaryotic messenger RNAs via RNA-recognition motifs. The binding of this protein to poly(A) promotes ribosome recruitment and translation initiation; it is also required for poly(A) shortening which is the first step in mRNA decay. The gene is part of a small gene family including three protein-coding genes and several pseudogenes. | 26986 | ENSG00000070756 | PABPC1 |
| myosin, heavy chain 9, non-muscle | This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | 4627 | ENSG00000100345 | MYH9 |
| myosin light chain 7 | NA | 58498 | ENSG00000106631 | MYL7 |
| troponin C1, slow skeletal and cardiac type | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | 7134 | ENSG00000114854 | TNNC1 |
| H3 histone, family 3B (H3.3B) | Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. Two molecules of each of the four core histones (H2A, H2B, H3, and H4) form an octamer, around which approximately 146 bp of DNA is wrapped in repeating units, called nucleosomes. The linker histone, H1, interacts with linker DNA between nucleosomes and functions in the compaction of chromatin into higher order structures. This gene contains introns and its mRNA is polyadenylated, unlike most histone genes. The protein encoded by this gene is a replication-independent histone that is a member of the histone H3 family. Pseudogenes of this gene have been identified on the X chromosome, and on chromosomes 5, 13 and 17. | 3021 | ENSG00000132475 | H3F3B |
| eukaryotic translation elongation factor 1 alpha 2 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 2) is expressed in brain, heart and skeletal muscle, and the other isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas. This gene may be critical in the development of ovarian cancer. | 1917 | ENSG00000101210 | EEF1A2 |
| myozenin 1 | The protein encoded by this gene is primarily expressed in the skeletal muscle, and belongs to the myozenin family. Members of this family function as calcineurin-interacting proteins that help tether calcineurin to the sarcomere of cardiac and skeletal muscle. They play an important role in modulation of calcineurin signaling. | 58529 | ENSG00000177791 | MYOZ1 |
| eukaryotic translation elongation factor 1 alpha 1 pseudogene 5 | NA | ENSG00000196205 | ENSG00000196205 | EEF1A1P5 |
| tropomyosin 3 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. | 7170 | ENSG00000143549 | TPM3 |
| protamine 1 | NA | 5619 | ENSG00000175646 | PRM1 |
| ribosomal protein L3 | Ribosomes, the complexes that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L3P family of ribosomal proteins and it is located in the cytoplasm. The protein can bind to the HIV-1 TAR mRNA, and it has been suggested that the protein contributes to tat-mediated transactivation. This gene is co-transcribed with several small nucleolar RNA genes, which are located in several of this gene’s introns. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | 6122 | ENSG00000100316 | RPL3 |
| hemoglobin subunit alpha 2 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3040 | ENSG00000188536 | HBA2 |
| DEAD-box helicase 5 | DEAD box proteins, characterized by the conserved motif Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure, such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein, which is an RNA-dependent ATPase, and also a proliferation-associated nuclear antigen, specifically reacting with the simian virus 40 tumor antigen. Alternative splicing results in multiple transcript variants. | 1655 | ENSG00000108654 | DDX5 |
| uncharacterized LOC100129518 | NA | 100129518 | ENSG00000112096 | LOC100129518 |
| superoxide dismutase 2, mitochondrial | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | 6648 | ENSG00000112096 | SOD2 |
| dual specificity phosphatase 1 | The expression of DUSP1 gene is induced in human skin fibroblasts by oxidative/heat stress and growth factors. It specifies a protein with structural features similar to members of the non-receptor-type protein-tyrosine phosphatase family, and which has significant amino-acid sequence similarity to a Tyr/Ser-protein phosphatase encoded by the late gene H1 of vaccinia virus. The bacterially expressed and purified DUSP1 protein has intrinsic phosphatase activity, and specifically inactivates mitogen-activated protein (MAP) kinase in vitro by the concomitant dephosphorylation of both its phosphothreonine and phosphotyrosine residues. Furthermore, it suppresses the activation of MAP kinase by oncogenic ras in extracts of Xenopus oocytes. Thus, DUSP1 may play an important role in the human cellular response to environmental stress as well as in the negative regulation of cellular proliferation. | 1843 | ENSG00000120129 | DUSP1 |
| LDL receptor related protein 1 | This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | 4035 | ENSG00000123384 | LRP1 |
| tropomyosin 4 | This gene encodes a member of the tropomyosin family of actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosins are dimers of coiled-coil proteins that polymerize end-to-end along the major groove in most actin filaments. They provide stability to the filaments and regulate access of other actin-binding proteins. In muscle cells, they regulate muscle contraction by controlling the binding of myosin heads to the actin filament. Multiple transcript variants encoding different isoforms have been found for this gene. | 7171 | ENSG00000167460 | TPM4 |
| actin, alpha, cardiac muscle 1 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | 70 | ENSG00000159251 | ACTC1 |
| actinin alpha 2 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | 88 | ENSG00000077522 | ACTN2 |
| myotilin | This gene encodes a cytoskeletal protein which plays a significant role in the stability of thin filaments during muscle contraction. This protein binds F-actin, crosslinks actin filaments, and prevents latrunculin A-induced filament disassembly. Mutations in this gene have been associated with limb-girdle muscular dystrophy and myofibrillar myopathies. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | 9499 | ENSG00000120729 | MYOT |
| calsequestrin 1 | This gene encodes the skeletal muscle specific member of the calsequestrin protein family. Calsequestrin functions as a luminal sarcoplasmic reticulum calcium sensor in both cardiac and skeletal muscle cells. This protein, also known as calmitine, functions as a calcium regulator in the mitochondria of skeletal muscle. This protein is absent in patients with Duchenne and Becker types of muscular dystrophy. | 844 | ENSG00000143318 | CASQ1 |
| bridging integrator 1 | This gene encodes several isoforms of a nucleocytoplasmic adaptor protein, one of which was initially identified as a MYC-interacting protein with features of a tumor suppressor. Isoforms that are expressed in the central nervous system may be involved in synaptic vesicle endocytosis and may interact with dynamin, synaptojanin, endophilin, and clathrin. Isoforms that are expressed in muscle and ubiquitously expressed isoforms localize to the cytoplasm and nucleus and activate a caspase-independent apoptotic process. Studies in mouse suggest that this gene plays an important role in cardiac muscle development. Alternate splicing of the gene results in several transcript variants encoding different isoforms. Aberrant splice variants expressed in tumor cell lines have also been described. | 274 | ENSG00000136717 | BIN1 |
| thyroglobulin | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | 7038 | ENSG00000042832 | TG |
| titin-cap | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | 8557 | ENSG00000173991 | TCAP |
| SH3 and cysteine rich domain 3 | The protein encoded by this gene is a component of the excitation-contraction coupling machinery of muscles. This protein is a member of the Stac gene family and contains an N-terminal cysteine-rich domain and two SH3 domains. Mutations in this gene are a cause of Native American myopathy. | 246329 | ENSG00000185482 | STAC3 |
| aldolase, fructose-bisphosphate A | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | 226 | ENSG00000149925 | ALDOA |
| myosin light chain 6 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain that is expressed in smooth muscle and non-muscle tissues. Genomic sequences representing several pseudogenes have been described and two transcript variants encoding different isoforms have been identified for this gene. | 4637 | ENSG00000092841 | MYL6 |
| glutathione peroxidase 3 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | 2878 | ENSG00000211445 | GPX3 |
| PHD finger protein 7 | Spermatogenesis is a complex process regulated by extracellular and intracellular factors as well as cellular interactions among interstitial cells of the testis, Sertoli cells, and germ cells. This gene is expressed in the testis in Sertoli cells but not germ cells. The protein encoded by this gene contains plant homeodomain (PHD) finger domains, also known as leukemia associated protein (LAP) domains, believed to be involved in transcriptional regulation. The protein, which localizes to the nucleus of transfected cells, has been implicated in the transcriptional regulation of spermatogenesis. Alternate splicing results in multiple transcript variants of this gene. | 51533 | ENSG00000010318 | PHF7 |
| amyloid beta precursor like protein 2 | This gene encodes amyloid precursor-like protein 2 (APLP2), which is a member of the APP (amyloid precursor protein) family including APP, APLP1 and APLP2. This protein is ubiquitously expressed. It contains heparin-, copper- and zinc-binding domains at the N-terminus, BPTI/Kunitz inhibitor and E2 domains in the middle region, and transmembrane and intracellular domains at the C-terminus. This protein interacts with major histocompatibility complex (MHC) class I molecules. The synergy of this protein and the APP is required to mediate neuromuscular transmission, spatial learning and synaptic plasticity. This protein has been implicated in the pathogenesis of Alzheimer’s disease. Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | 334 | ENSG00000084234 | APLP2 |
| troponin T2, cardiac type | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | 7139 | ENSG00000118194 | TNNT2 |
| filamin C | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 | ENSG00000128591 | FLNC |
| tripartite motif containing 63 | This gene encodes a member of the RING zinc finger protein family found in striated muscle and iris. The product of this gene is an E3 ubiquitin ligase that localizes to the Z-line and M-line lattices of myofibrils. This protein plays an important role in the atrophy of skeletal and cardiac muscle and is required for the degradation of myosin heavy chain proteins, myosin light chain, myosin binding protein, and for muscle-type creatine kinase. | 84676 | ENSG00000158022 | TRIM63 |
| integral membrane protein 2B | Amyloid precursor proteins are processed by beta-secretase and gamma-secretase to produce beta-amyloid peptides which form the characteristic plaques of Alzheimer disease. This gene encodes a transmembrane protein which is processed at the C-terminus by furin or furin-like proteases to produce a small secreted peptide which inhibits the deposition of beta-amyloid. Mutations which result in extension of the C-terminal end of the encoded protein, thereby increasing the size of the secreted peptide, are associated with two neurodegenerative diseases, familial British dementia and familial Danish dementia. | 9445 | ENSG00000136156 | ITM2B |
| CD81 molecule | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein that is known to complex with integrins. This protein appears to promote muscle cell fusion and support myotube maintenance. Also it may be involved in signal transduction. This gene is localized in the tumor-suppressor gene region and thus it is a candidate gene for malignancies. Two transcript variants encoding different isoforms have been found for this gene. | 975 | ENSG00000110651 | CD81 |
| prothymosin, alpha | NA | 5757 | ENSG00000187514 | PTMA |
| prothymosin alpha-like | NA | 728026 | ENSG00000187514 | LOC728026 |
| KIAA0754 | NA | 643314 | ENSG00000127603 | KIAA0754 |
| microtubule-actin crosslinking factor 1 | This gene encodes a large protein containing numerous spectrin and leucine-rich repeat (LRR) domains. The encoded protein is a member of a family of proteins that form bridges between different cytoskeletal elements. This protein facilitates actin-microtubule interactions at the cell periphery and couples the microtubule network to cellular junctions. Alternative splicing results in multiple transcript variants, but the full-length nature of some of these variants has not been determined. | 23499 | ENSG00000127603 | MACF1 |
| eukaryotic translation initiation factor 4 gamma 2 | Translation initiation is mediated by specific recognition of the cap structure by eukaryotic translation initiation factor 4F (eIF4F), which is a cap binding protein complex that consists of three subunits: eIF4A, eIF4E and eIF4G. The protein encoded by this gene shares similarity with the C-terminal region of eIF4G that contains the binding sites for eIF4A and eIF3; eIF4G, in addition, contains a binding site for eIF4E at the N-terminus. Unlike eIF4G, which supports cap-dependent and independent translation, this gene product functions as a general repressor of translation by forming translationally inactive complexes. In vitro and in vivo studies indicate that translation of this mRNA initiates exclusively at a non-AUG (GUG) codon. Alternatively spliced transcript variants encoding different isoforms of this gene have been described. | 1982 | ENSG00000110321 | EIF4G2 |
| four and a half LIM domains 3 | The protein encoded by this gene is a member of a family of proteins containing a four-and-a-half LIM domain, which is a highly conserved double zinc finger motif. The encoded protein has been shown to interact with the cancer developmental regulators SMAD2, SMAD3, and SMAD4, the skeletal muscle myogenesis protein MyoD, and the high-affinity IgE beta chain regulator MZF-1. This protein may be involved in tumor suppression, repression of MyoD expression, and repression of IgE receptor expression. Two transcript variants encoding different isoforms have been found for this gene. | 2275 | ENSG00000183386 | FHL3 |
| ras homolog family member A | This gene encodes a member of the Rho family of small GTPases, which cycle between inactive GDP-bound and active GTP-bound states and function as molecular switches in signal transduction cascades. Rho proteins promote reorganization of the actin cytoskeleton and regulate cell shape, attachment, and motility. Overexpression of this gene is associated with tumor cell proliferation and metastasis. Multiple alternatively spliced variants have been identified. | 387 | ENSG00000067560 | RHOA |
| ras homolog family member B | NA | 388 | ENSG00000143878 | RHOB |
| talin 1 | This gene encodes a cytoskeletal protein that is concentrated in areas of cell-substratum and cell-cell contacts. The encoded protein plays a significant role in the assembly of actin filaments and in spreading and migration of various cell types, including fibroblasts and osteoclasts. It codistributes with integrins in the cell surface membrane in order to assist in the attachment of adherent cells to extracellular matrices and of lymphocytes to other cells. The N-terminus of this protein contains elements for localization to cell-extracellular matrix junctions. The C-terminus contains binding sites for proteins such as beta-1-integrin, actin, and vinculin. | 7094 | ENSG00000137076 | TLN1 |
| eukaryotic translation initiation factor 4A1 | NA | 1973 | ENSG00000161960 | EIF4A1 |
| amyloid beta precursor protein | This gene encodes a cell surface receptor and transmembrane precursor protein that is cleaved by secretases to form a number of peptides. Some of these peptides are secreted and can bind to the acetyltransferase complex APBB1/TIP60 to promote transcriptional activation, while others form the protein basis of the amyloid plaques found in the brains of patients with Alzheimer disease. In addition, two of the peptides are antimicrobial peptides, having been shown to have bacteriocidal and antifungal activities. Mutations in this gene have been implicated in autosomal dominant Alzheimer disease and cerebroarterial amyloidosis (cerebral amyloid angiopathy). Multiple transcript variants encoding several different isoforms have been found for this gene. | 351 | ENSG00000142192 | APP |
| splicing factor 3b subunit 1 | This gene encodes subunit 1 of the splicing factor 3b protein complex. Splicing factor 3b, together with splicing factor 3a and a 12S RNA unit, forms the U2 small nuclear ribonucleoproteins complex (U2 snRNP). The splicing factor 3b/3a complex binds pre-mRNA upstream of the intron’s branch site in a sequence independent manner and may anchor the U2 snRNP to the pre-mRNA. Splicing factor 3b is also a component of the minor U12-type spliceosome. The carboxy-terminal two-thirds of subunit 1 have 22 non-identical, tandem HEAT repeats that form rod-like, helical structures. Alternative splicing results in multiple transcript variants encoding different isoforms. | 23451 | ENSG00000115524 | SF3B1 |
| lysosomal associated membrane protein 1 | The protein encoded by this gene is a member of a family of membrane glycoproteins. This glycoprotein provides selectins with carbohydrate ligands. It may also play a role in tumor cell metastasis. | 3916 | ENSG00000185896 | LAMP1 |
| heterogeneous nuclear ribonucleoprotein U | This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins and they form complexes with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene contains a RNA binding domain and scaffold-associated region (SAR)-specific bipartite DNA-binding domain. This protein is also thought to be involved in the packaging of hnRNA into large ribonucleoprotein complexes. During apoptosis, this protein is cleaved in a caspase-dependent way. Cleavage occurs at the SALD site, resulting in a loss of DNA-binding activity and a concomitant detachment of this protein from nuclear structural sites. But this cleavage does not affect the function of the encoded protein in RNA metabolism. At least two alternatively spliced transcript variants have been identified for this gene. | 3192 | ENSG00000153187 | HNRNPU |
| trans-golgi network protein 2 | This gene encodes a type I integral membrane protein that is localized to the trans-Golgi network, a major sorting station for secretory and membrane proteins. The encoded protein cycles between early endosomes and the trans-Golgi network, and may play a role in exocytic vesicle formation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 10618 | ENSG00000152291 | TGOLN2 |
| calnexin | This gene encodes a member of the calnexin family of molecular chaperones. The encoded protein is a calcium-binding, endoplasmic reticulum (ER)-associated protein that interacts transiently with newly synthesized N-linked glycoproteins, facilitating protein folding and assembly. It may also play a central role in the quality control of protein folding by retaining incorrectly folded protein subunits within the ER for degradation. Alternatively spliced transcript variants encoding the same protein have been described. | 821 | ENSG00000127022 | CANX |
| Y-box binding protein 1 | This gene encodes a highly conserved cold shock domain protein that has broad nucleic acid binding properties. The encoded protein functions as both a DNA and RNA binding protein and has been implicated in numerous cellular processes including regulation of transcription and translation, pre-mRNA splicing, DNA reparation and mRNA packaging. This protein is also a component of messenger ribonucleoprotein (mRNP) complexes and may have a role in microRNA processing. This protein can be secreted through non-classical pathways and functions as an extracellular mitogen. Aberrant expression of the gene is associated with cancer proliferation in numerous tissues. This gene may be a prognostic marker for poor outcome and drug resistance in certain cancers. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on multiple chromosomes. | 4904 | ENSG00000065978 | YBX1 |
| decorin | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 | ENSG00000011465 | DCN |
| spectrin alpha, non-erythrocytic 1 | Spectrins are a family of filamentous cytoskeletal proteins that function as essential scaffold proteins that stabilize the plasma membrane and organize intracellular organelles. Spectrins are composed of alpha and beta dimers that associate to form tetramers linked in a head-to-head arrangement. This gene encodes an alpha spectrin that is specifically expressed in nonerythrocytic cells. The encoded protein has been implicated in other cellular functions including DNA repair and cell cycle regulation. Mutations in this gene are the cause of early infantile epileptic encephalopathy-5. Alternate splicing results in multiple transcript variants. | 6709 | ENSG00000197694 | SPTAN1 |
| cyclin I | The protein encoded by this gene belongs to the highly conserved cyclin family, whose members are characterized by a dramatic periodicity in protein abundance through the cell cycle. Cyclins function as regulators of CDK kinases. Different cyclins exhibit distinct expression and degradation patterns which contribute to the temporal coordination of each mitotic event. This cyclin shows the highest similarity with cyclin G. The transcript of this gene was found to be expressed constantly during cell cycle progression. The function of this cyclin has not yet been determined. | 10983 | ENSG00000118816 | CCNI |
| myosin binding protein C, cardiac | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | 4607 | ENSG00000134571 | MYBPC3 |
| S100 calcium binding protein A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | 6280 | ENSG00000163220 | S100A9 |
| alpha-2-macroglobulin | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | 2 | ENSG00000175899 | A2M |
| AHNAK nucleoprotein | NA | 79026 | ENSG00000124942 | AHNAK |
| myosin light chain 4 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two myosin heavy chains, two nonphosphorylatable myosin alkali light chains, and two phosphorylatable myosin regulatory light chains. This gene encodes a myosin alkali light chain that is found in embryonic muscle and adult atria. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | 4635 | ENSG00000198336 | MYL4 |
| actinin alpha 4 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, alpha actinin isoform which is concentrated in the cytoplasm, and thought to be involved in metastatic processes. Mutations in this gene have been associated with focal and segmental glomerulosclerosis. | 81 | ENSG00000130402 | ACTN4 |
| heterogeneous nuclear ribonucleoprotein K | This gene belongs to the subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA binding proteins and they complex with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene is located in the nucleoplasm and has three repeats of KH domains that bind to RNAs. It is distinct among other hnRNP proteins in its binding preference; it binds tenaciously to poly(C). This protein is also thought to have a role during cell cycle progression. Several alternatively spliced transcript variants have been described for this gene; however, not all of them are fully characterized. | 3190 | ENSG00000165119 | HNRNPK |
| uncharacterized LOC100507537 | NA | 100507537 | ENSG00000240045 | LOC100507537 |
| nischarin | This gene encodes a nonadrenergic imidazoline-1 receptor protein that localizes to the cytosol and anchors to the inner layer of the plasma membrane. The orthologous mouse protein has been shown to influence cytoskeletal organization and cell migration by binding to alpha-5-beta-1 integrin. In humans, this protein has been shown to bind to the adapter insulin receptor substrate 4 (IRS4) to mediate translocation of alpha-5 integrin from the cell membrane to endosomes. Expression of this protein was reduced in human breast cancers while its overexpression reduced tumor growth and metastasis, possibly by limiting the expression of alpha-5 integrin. In human cardiac tissue, this gene was found to affect cell growth and death while in neural tissue it affected neuronal growth and differentiation. Alternative splicing results in multiple transcript variants encoding different isoforms. Some isoforms lack the expected C-terminal domains of a functional imidazoline receptor. | 11188 | ENSG00000010322 | NISCH |
| poly(rC) binding protein 2 | The protein encoded by this gene appears to be multifunctional. Along with PCBP-1 and hnRNPK, it is one of the major cellular poly(rC)-binding proteins. The encoded protein contains three K-homologous (KH) domains which may be involved in RNA binding. Together with PCBP-1, this protein also functions as a translational coactivator of poliovirus RNA via a sequence-specific interaction with stem-loop IV of the IRES, promoting poliovirus RNA replication by binding to its 5’-terminal cloverleaf structure. It has also been implicated in translational control of the 15-lipoxygenase mRNA, human papillomavirus type 16 L2 mRNA, and hepatitis A virus RNA. The encoded protein is also suggested to play a part in formation of a sequence-specific alpha-globin mRNP complex which is associated with alpha-globin mRNA stability. This multiexon structural mRNA is thought to be retrotransposed to generate PCBP-1, an intronless gene with functions similar to that of PCBP2. This gene and PCBP-1 have paralogous genes (PCBP3 and PCBP4) which are thought to have arisen as a result of duplication events of entire genes. This gene also has two processed pseudogenes (PCBP2P1 and PCBP2P2). Multiple transcript variants encoding different isoforms have been found for this gene. | 5094 | ENSG00000197111 | PCBP2 |
| eukaryotic translation elongation factor 1 alpha 1 pseudogene 6 | NA | ENSG00000233476 | ENSG00000233476 | EEF1A1P6 |
| polymerase (RNA) II subunit A | This gene encodes the largest subunit of RNA polymerase II, the polymerase responsible for synthesizing messenger RNA in eukaryotes. The product of this gene contains a carboxy terminal domain composed of heptapeptide repeats that are essential for polymerase activity. These repeats contain serine and threonine residues that are phosphorylated in actively transcribing RNA polymerase. In addition, this subunit, in combination with several other polymerase subunits, forms the DNA binding domain of the polymerase, a groove in which the DNA template is transcribed into RNA. | 5430 | ENSG00000181222 | POLR2A |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_",19,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
out <- mygene::queryMany(gene_list[20,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | symbol | summary | query | name | notfound |
|---|---|---|---|---|---|
| 3858 | KRT10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | ENSG00000186395 | keratin 10 | NA |
| 3848 | KRT1 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000167768 | keratin 1 | NA |
| 4155 | MBP | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | ENSG00000197971 | myelin basic protein | NA |
| 3849 | KRT2 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000172867 | keratin 2 | NA |
| 5620 | PRM2 | Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | ENSG00000122304 | protamine 2 | NA |
| ENSG00000266844 | RP11-862L9.3 | NA | ENSG00000266844 | NA | NA |
| 3860 | KRT13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | ENSG00000171401 | keratin 13 | NA |
| 64065 | PERP | NA | ENSG00000112378 | PERP, TP53 apoptosis effector | NA |
| 93099 | DMKN | This gene is upregulated in inflammatory diseases, and it was first observed as expressed in the differentiated layers of skin. The most interesting aspect of this gene is the differential use of promoters and terminators to generate isoforms with unique cellular distributions and domain components. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | ENSG00000161249 | dermokine | NA |
| 5619 | PRM1 | NA | ENSG00000175646 | protamine 1 | NA |
| 1674 | DES | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | ENSG00000175084 | desmin | NA |
| 3852 | KRT5 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000186081 | keratin 5 | NA |
| 51806 | CALML5 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. | ENSG00000178372 | calmodulin like 5 | NA |
| 3861 | KRT14 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | ENSG00000186847 | keratin 14 | NA |
| 5166 | PDK4 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | ENSG00000004799 | pyruvate dehydrogenase kinase 4 | NA |
| 283131 | NEAT1 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | ENSG00000245532 | nuclear paraspeckle assembly transcript 1 (non-protein coding) | NA |
| 51533 | PHF7 | Spermatogenesis is a complex process regulated by extracellular and intracellular factors as well as cellular interactions among interstitial cells of the testis, Sertoli cells, and germ cells. This gene is expressed in the testis in Sertoli cells but not germ cells. The protein encoded by this gene contains plant homeodomain (PHD) finger domains, also known as leukemia associated protein (LAP) domains, believed to be involved in transcriptional regulation. The protein, which localizes to the nucleus of transfected cells, has been implicated in the transcriptional regulation of spermatogenesis. Alternate splicing results in multiple transcript variants of this gene. | ENSG00000010318 | PHD finger protein 7 | NA |
| 7168 | TPM1 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | ENSG00000140416 | tropomyosin 1 (alpha) | NA |
| 58473 | PLEKHB1 | NA | ENSG00000021300 | pleckstrin homology domain containing B1 | NA |
| 3851 | KRT4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000170477 | keratin 4 | NA |
| 2670 | GFAP | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | ENSG00000131095 | glial fibrillary acidic protein | NA |
| 4014 | LOR | This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel’s syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases. | ENSG00000203782 | loricrin | NA |
| 7178 | TPT1 | NA | ENSG00000133112 | tumor protein, translationally-controlled 1 | NA |
| 60 | ACTB | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ENSG00000075624 | actin, beta | NA |
| 1471 | CST3 | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and the kininogens. The type 2 cystatin proteins are a class of cysteine proteinase inhibitors found in a variety of human fluids and secretions, where they appear to provide protective functions. The cystatin locus on chromosome 20 contains the majority of the type 2 cystatin genes and pseudogenes. This gene is located in the cystatin locus and encodes the most abundant extracellular inhibitor of cysteine proteases, which is found in high concentrations in biological fluids and is expressed in virtually all organs of the body. A mutation in this gene has been associated with amyloid angiopathy. Expression of this protein in vascular wall smooth muscle cells is severely reduced in both atherosclerotic and aneurysmal aortic lesions, establishing its role in vascular disease. In addition, this protein has been shown to have an antimicrobial function, inhibiting the replication of herpes simplex virus. Alternative splicing results in multiple transcript variants encoding a single protein. | ENSG00000101439 | cystatin C | NA |
| 7314 | UBB | This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | ENSG00000170315 | ubiquitin B | NA |
| 388533 | KRTDAP | This gene encodes a protein which may function in the regulation of keratinocyte differentiation and maintenance of stratified epithelia. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000188508 | keratinocyte differentiation associated protein | NA |
| 222166 | MTURN | NA | ENSG00000180354 | maturin, neural progenitor differentiation regulator homolog (Xenopus) | NA |
| 7169 | TPM2 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000198467 | tropomyosin 2 (beta) | NA |
| 5317 | PKP1 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may be involved in molecular recruitment and stabilization during desmosome formation. Mutations in this gene have been associated with the ectodermal dysplasia/skin fragility syndrome. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000081277 | plakophilin 1 | NA |
| 682 | BSG | The protein encoded by this gene is a plasma membrane protein that is important in spermatogenesis, embryo implantation, neural network formation, and tumor progression. The encoded protein is also a member of the immunoglobulin superfamily. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000172270 | basigin (Ok blood group) | NA |
| 2335 | FN1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | ENSG00000115414 | fibronectin 1 | NA |
| 117159 | DCD | This antimicrobial gene encodes a secreted protein that is subsequently processed into mature peptides of distinct biological activities. The C-terminal peptide is constitutively expressed in sweat and has antibacterial and antifungal activities. The N-terminal peptide, also known as diffusible survival evasion peptide, promotes neural cell survival under conditions of severe oxidative stress. A glycosylated form of the N-terminal peptide may be associated with cachexia (muscle wasting) in cancer patients. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000161634 | dermcidin | NA |
| 2023 | ENO1 | This gene encodes alpha-enolase, one of three enolase isoenzymes found in mammals. Each isoenzyme is a homodimer composed of 2 alpha, 2 gamma, or 2 beta subunits, and functions as a glycolytic enzyme. Alpha-enolase in addition, functions as a structural lens protein (tau-crystallin) in the monomeric form. Alternative splicing of this gene results in a shorter isoform that has been shown to bind to the c-myc promoter and function as a tumor suppressor. Several pseudogenes have been identified, including one on the long arm of chromosome 1. Alpha-enolase has also been identified as an autoantigen in Hashimoto encephalopathy. | ENSG00000074800 | enolase 1 | NA |
| 65108 | MARCKSL1 | This gene encodes a member of the myristoylated alanine-rich C-kinase substrate (MARCKS) family. Members of this family play a role in cytoskeletal regulation, protein kinase C signaling and calmodulin signaling. The encoded protein affects the formation of adherens junction. Alternative splicing results in multiple transcript variants. Pseudogenes of this gene are located on the long arm of chromosomes 6 and 10. | ENSG00000175130 | MARCKS like 1 | NA |
| 6707 | SPRR3 | NA | ENSG00000163209 | small proline rich protein 3 | NA |
| 10409 | BASP1 | This gene encodes a membrane bound protein with several transient phosphorylation sites and PEST motifs. Conservation of proteins with PEST sequences among different species supports their functional significance. PEST sequences typically occur in proteins with high turnover rates. Immunological characteristics of this protein are species specific. This protein also undergoes N-terminal myristoylation. Alternative splicing results in multiple transcript variants that encode the same protein. | ENSG00000176788 | brain abundant membrane attached signal protein 1 | NA |
| 2879 | GPX4 | This gene encodes a member of the glutathione peroxidase protein family. Glutathione peroxidase catalyzes the reduction of hydrogen peroxide, organic hydroperoxide, and lipid peroxides by reduced glutathione and functions in the protection of cells against oxidative damage. Human plasma glutathione peroxidase has been shown to be a selenium-containing enzyme and the UGA codon is translated into a selenocysteine. The encoded protein has been identified as a moonlighting protein based on its ability to serve dual functions as a peroxidase as well as a structural protein in mature spermatozoa. Through alternative splicing and transcription initiation, rat produces proteins that localize to the nucleus, mitochondrion, and cytoplasm. In humans, alternative transcription initiation and the cleavage sites of the mitochondrial and nuclear transit peptides need to be experimentally verified. Alternative splicing results in multiple transcript variants. | ENSG00000167468 | glutathione peroxidase 4 | NA |
| 2778 | GNAS | This locus has a highly complex imprinted expression pattern. It gives rise to maternally, paternally, and biallelically expressed transcripts that are derived from four alternative promoters and 5’ exons. Some transcripts contain a differentially methylated region (DMR) at their 5’ exons, and this DMR is commonly found in imprinted genes and correlates with transcript expression. An antisense transcript is produced from an overlapping locus on the opposite strand. One of the transcripts produced from this locus, and the antisense transcript, are paternally expressed noncoding RNAs, and may regulate imprinting in this region. In addition, one of the transcripts contains a second overlapping ORF, which encodes a structurally unrelated protein - Alex. Alternative splicing of downstream exons is also observed, which results in different forms of the stimulatory G-protein alpha subunit, a key element of the classical signal transduction pathway linking receptor-ligand interactions with the activation of adenylyl cyclase and a variety of cellular responses. Multiple transcript variants encoding different isoforms have been found for this gene. Mutations in this gene result in pseudohypoparathyroidism type 1a, pseudohypoparathyroidism type 1b, Albright hereditary osteodystrophy, pseudopseudohypoparathyroidism, McCune-Albright syndrome, progressive osseous heteroplasia, polyostotic fibrous dysplasia of bone, and some pituitary tumors. | ENSG00000087460 | GNAS complex locus | NA |
| 3691 | ITGB4 | Integrins are heterodimers comprised of alpha and beta subunits, that are noncovalently associated transmembrane glycoprotein receptors. Different combinations of alpha and beta polypeptides form complexes that vary in their ligand-binding specificities. Integrins mediate cell-matrix or cell-cell adhesion, and transduced signals that regulate gene expression and cell growth. This gene encodes the integrin beta 4 subunit, a receptor for the laminins. This subunit tends to associate with alpha 6 subunit and is likely to play a pivotal role in the biology of invasive carcinoma. Mutations in this gene are associated with epidermolysis bullosa with pyloric atresia. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | ENSG00000132470 | integrin subunit beta 4 | NA |
| 5660 | PSAP | This gene encodes a highly conserved preproprotein that is proteolytically processed to generate four main cleavage products including saposins A, B, C, and D. Each domain of the precursor protein is approximately 80 amino acid residues long with nearly identical placement of cysteine residues and glycosylation sites. Saposins A-D localize primarily to the lysosomal compartment where they facilitate the catabolism of glycosphingolipids with short oligosaccharide groups. The precursor protein exists both as a secretory protein and as an integral membrane protein and has neurotrophic activities. Mutations in this gene have been associated with Gaucher disease and metachromatic leukodystrophy. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000197746 | prosaposin | NA |
| 27122 | DKK3 | This gene encodes a protein that is a member of the dickkopf family. The secreted protein contains two cysteine rich regions and is involved in embryonic development through its interactions with the Wnt signaling pathway. The expression of this gene is decreased in a variety of cancer cell lines and it may function as a tumor suppressor gene. Alternative splicing results in multiple transcript variants encoding the same protein. | ENSG00000050165 | dickkopf WNT signaling pathway inhibitor 3 | NA |
| 6176 | RPLP1 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal phosphoprotein that is a component of the 60S subunit. The protein, which is a functional equivalent of the E. coli L7/L12 ribosomal protein, belongs to the L12P family of ribosomal proteins. It plays an important role in the elongation step of protein synthesis. Unlike most ribosomal proteins, which are basic, the encoded protein is acidic. Its C-terminal end is nearly identical to the C-terminal ends of the ribosomal phosphoproteins P0 and P2. The P1 protein can interact with P0 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Two alternatively spliced transcript variants that encode different proteins have been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ENSG00000137818 | ribosomal protein lateral stalk subunit P1 | NA |
| 9572 | NR1D1 | This gene encodes a transcription factor that is a member of the nuclear receptor subfamily 1. The encoded protein is a ligand-sensitive transcription factor that negatively regulates the expression of core clock proteins. In particular this protein represses the circadian clock transcription factor aryl hydrocarbon receptor nuclear translocator-like protein 1 (ARNTL). This protein may also be involved in regulating genes that function in metabolic, inflammatory and cardiovascular processes. | ENSG00000126368 | nuclear receptor subfamily 1 group D member 1 | NA |
| 1277 | COL1A1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000108821 | collagen type I alpha 1 | NA |
| 23650 | TRIM29 | The protein encoded by this gene belongs to the TRIM protein family. It has multiple zinc finger motifs and a leucine zipper motif. It has been proposed to form homo- or heterodimers which are involved in nucleic acid binding. Thus, it may act as a transcriptional regulatory factor involved in carcinogenesis and/or differentiation. It may also function in the suppression of radiosensitivity since it is associated with ataxia telangiectasia phenotype. | ENSG00000137699 | tripartite motif containing 29 | NA |
| 1675 | CFD | This gene encodes a member of the S1, or chymotrypsin, family of serine peptidases. This protease catalyzes the cleavage of factor B, the rate-limiting step of the alternative pathway of complement activation. This protein also functions as an adipokine, a cell signaling protein secreted by adipocytes, which regulates insulin secretion in mice. Mutations in this gene underlie complement factor D deficiency, which is associated with recurrent bacterial meningitis infections in human patients. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate the mature protease. | ENSG00000197766 | complement factor D | NA |
| 1281 | COL3A1 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000168542 | collagen type III alpha 1 chain | NA |
| 348 | APOE | The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | ENSG00000130203 | apolipoprotein E | NA |
| 7018 | TF | This gene encodes a glycoprotein with an approximate molecular weight of 76.5 kDa. It is thought to have been created as a result of an ancient gene duplication event that led to generation of homologous C and N-terminal domains each of which binds one ion of ferric iron. The function of this protein is to transport iron from the intestine, reticuloendothelial system, and liver parenchymal cells to all proliferating cells in the body. This protein may also have a physiologic role as granulocyte/pollen-binding protein (GPBP) involved in the removal of certain organic matter and allergens from serum. | ENSG00000091513 | transferrin | NA |
| 7145 | TNS1 | The protein encoded by this gene localizes to focal adhesions, regions of the plasma membrane where the cell attaches to the extracellular matrix. This protein crosslinks actin filaments and contains a Src homology 2 (SH2) domain, which is often found in molecules involved in signal transduction. This protein is a substrate of calpain II. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000079308 | tensin 1 | NA |
| 150094 | SIK1 | NA | ENSG00000142178 | salt inducible kinase 1 | NA |
| ENSG00000229732 | AC019349.5 | NA | ENSG00000229732 | NA | NA |
| 2934 | GSN | The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | ENSG00000148180 | gelsolin | NA |
| 57699 | CPNE5 | Calcium-dependent membrane-binding proteins may regulate molecular events at the interface of the cell membrane and cytoplasm. This gene is one of several genes that encode a calcium-dependent protein containing two N-terminal type II C2 domains and an integrin A domain-like sequence in the C-terminus. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. More variants may exist, but their full-length natures could not be determined. | ENSG00000124772 | copine 5 | NA |
| 28996 | HIPK2 | This gene encodes a conserved serine/threonine kinase that is a member of the homeodomain-interacting protein kinase family. The encoded protein interacts with homeodomain transcription factors and many other transcription factors such as p53, and can function as both a corepressor and a coactivator depending on the transcription factor and its subcellular localization. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000064393 | homeodomain interacting protein kinase 2 | NA |
| 55076 | TMEM45A | NA | ENSG00000181458 | transmembrane protein 45A | NA |
| 604 | BCL6 | The protein encoded by this gene is a zinc finger transcription factor and contains an N-terminal POZ domain. This protein acts as a sequence-specific repressor of transcription, and has been shown to modulate the transcription of STAT-dependent IL-4 responses of B cells. This protein can interact with a variety of POZ-containing proteins that function as transcription corepressors. This gene is found to be frequently translocated and hypermutated in diffuse large-cell lymphoma (DLCL), and may be involved in the pathogenesis of DLCL. Alternatively spliced transcript variants encoding different protein isoforms have been found for this gene. | ENSG00000113916 | B-cell CLL/lymphoma 6 | NA |
| 171024 | SYNPO2 | NA | ENSG00000172403 | synaptopodin 2 | NA |
| ENSG00000265401 | RP11-138I1.4 | NA | ENSG00000265401 | NA | NA |
| 79957 | PAQR6 | NA | ENSG00000160781 | progestin and adipoQ receptor family member 6 | NA |
| 2495 | FTH1 | This gene encodes the heavy subunit of ferritin, the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in ferritin proteins are associated with several neurodegenerative diseases. This gene has multiple pseudogenes. Several alternatively spliced transcript variants have been observed, but their biological validity has not been determined. | ENSG00000167996 | ferritin heavy chain 1 | NA |
| 151516 | ASPRV1 | NA | ENSG00000244617 | aspartic peptidase, retroviral-like 1 | NA |
| 6095 | RORA | The protein encoded by this gene is a member of the NR1 subfamily of nuclear hormone receptors. It can bind as a monomer or as a homodimer to hormone response elements upstream of several genes to enhance the expression of those genes. The encoded protein has been shown to interact with NM23-2, a nucleoside diphosphate kinase involved in organogenesis and differentiation, as well as with NM23-1, the product of a tumor metastasis suppressor candidate gene. Also, it has been shown to aid in the transcriptional regulation of some genes involved in circadian rhythm. Four transcript variants encoding different isoforms have been described for this gene. | ENSG00000069667 | RAR related orphan receptor A | NA |
| 6175 | RPLP0 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein, which is the functional equivalent of the E. coli L10 ribosomal protein, belongs to the L10P family of ribosomal proteins. It is a neutral phosphoprotein with a C-terminal end that is nearly identical to the C-terminal ends of the acidic ribosomal phosphoproteins P1 and P2. The P0 protein can interact with P1 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Transcript variants derived from alternative splicing exist; they encode the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ENSG00000089157 | ribosomal protein lateral stalk subunit P0 | NA |
| 1293 | COL6A3 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | ENSG00000163359 | collagen type VI alpha 3 chain | NA |
| 79026 | AHNAK | NA | ENSG00000124942 | AHNAK nucleoprotein | NA |
| NA | NA | NA | ENSG00000117289 | NA | TRUE |
| 6280 | S100A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | ENSG00000163220 | S100 calcium binding protein A9 | NA |
| 49860 | CRNN | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | ENSG00000143536 | cornulin | NA |
| 6440 | SFTPC | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | ENSG00000168484 | surfactant protein C | NA |
| 125 | ADH1B | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000196616 | alcohol dehydrogenase 1B (class I), beta polypeptide | NA |
| 3728 | JUP | This gene encodes a major cytoplasmic protein which is the only known constituent common to submembranous plaques of both desmosomes and intermediate junctions. This protein forms distinct complexes with cadherins and desmosomal cadherins and is a member of the catenin family since it contains a distinct repeating amino acid motif called the armadillo repeat. Mutation in this gene has been associated with Naxos disease. Alternative splicing occurs in this gene; however, not all transcripts have been fully described. | ENSG00000173801 | junction plakoglobin | NA |
| 11067 | C10orf10 | The expression of this gene is induced by fasting as well as by progesterone. The protein encoded by this gene contains a t-synaptosome-associated protein receptor (SNARE) coiled-coil homology domain and a peroxisomal targeting signal. Production of the encoded protein leads to phosphorylation and activation of the transcription factor ELK1. | ENSG00000165507 | chromosome 10 open reading frame 10 | NA |
| 9638 | FEZ1 | This gene is an ortholog of the C. elegans unc-76 gene, which is necessary for normal axonal bundling and elongation within axon bundles. Expression of this gene in C. elegans unc-76 mutants can restore to the mutants partial locomotion and axonal fasciculation, suggesting that it also functions in axonal outgrowth. The N-terminal half of the gene product is highly acidic. Alternatively spliced transcript variants encoding different isoforms of this protein have been described. | ENSG00000149557 | fasciculation and elongation protein zeta 1 | NA |
| 8507 | ENC1 | This gene encodes a member of the kelch-related family of actin-binding proteins. The encoded protein plays a role in the oxidative stress response as a regulator of the transcription factor Nrf2, and expression of this gene may play a role in malignant transformation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000171617 | ectodermal-neural cortex 1 | NA |
| 114907 | FBXO32 | This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of the ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbxs class and contains an F-box domain. This protein is highly expressed during muscle atrophy, whereas mice deficient in this gene were found to be resistant to atrophy. This protein is thus a potential drug target for the treatment of muscle atrophy. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000156804 | F-box protein 32 | NA |
| 81691 | LOC81691 | NA | ENSG00000005189 | exonuclease NEF-sp | NA |
| 3315 | HSPB1 | The protein encoded by this gene is induced by environmental stress and developmental changes. The encoded protein is involved in stress resistance and actin organization and translocates from the cytoplasm to the nucleus upon stress induction. Defects in this gene are a cause of Charcot-Marie-Tooth disease type 2F (CMT2F) and distal hereditary motor neuropathy (dHMN). | ENSG00000106211 | heat shock protein family B (small) member 1 | NA |
| 146225 | CMTM2 | This gene belongs to the chemokine-like factor gene superfamily, a novel family that links the chemokine and the transmembrane 4 superfamilies of signaling molecules. The protein encoded by this gene may play an important role in testicular development. | ENSG00000140932 | CKLF like MARVEL transmembrane domain containing 2 | NA |
| 6277 | S100A6 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in stimulation of Ca2+-dependent insulin release, stimulation of prolactin secretion, and exocytosis. Chromosomal rearrangements and altered expression of this gene have been implicated in melanoma. | ENSG00000197956 | S100 calcium binding protein A6 | NA |
| 1278 | COL1A2 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000164692 | collagen type I alpha 2 chain | NA |
| 7316 | UBC | This gene represents a ubiquitin gene, ubiquitin C. The encoded protein is a polyubiquitin precursor. Conjugation of ubiquitin monomers or polymers can lead to various effects within a cell, depending on the residues to which ubiquitin is conjugated. Ubiquitination has been associated with protein degradation, DNA repair, cell cycle regulation, kinase modification, endocytosis, and regulation of other cell signaling pathways. | ENSG00000150991 | ubiquitin C | NA |
| 4666 | NACA | This gene encodes a protein that associates with basic transcription factor 3 (BTF3) to form the nascent polypeptide-associated complex (NAC). This complex binds to nascent proteins that lack a signal peptide motif as they emerge from the ribosome, blocking interaction with the signal recognition particle (SRP) and preventing mistranslocation to the endoplasmic reticulum. This protein is an IgE autoantigen in atopic dermatitis patients. Alternative splicing results in multiple transcript variants, but the full length nature of some of these variants, including those encoding very large proteins, has not been determined. There are multiple pseudogenes of this gene on different chromosomes. | ENSG00000196531 | nascent polypeptide-associated complex alpha subunit | NA |
| 7070 | THY1 | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | ENSG00000154096 | Thy-1 cell surface antigen | NA |
| 58476 | TP53INP2 | NA | ENSG00000078804 | tumor protein p53 inducible nuclear protein 2 | NA |
| 2261 | FGFR3 | This gene encodes a member of the fibroblast growth factor receptor (FGFR) family, with its amino acid sequence being highly conserved between members and among divergent species. FGFR family members differ from one another in their ligand affinities and tissue distribution. A full-length representative protein would consist of an extracellular region, composed of three immunoglobulin-like domains, a single hydrophobic membrane-spanning segment and a cytoplasmic tyrosine kinase domain. The extracellular portion of the protein interacts with fibroblast growth factors, setting in motion a cascade of downstream signals, ultimately influencing mitogenesis and differentiation. This particular family member binds acidic and basic fibroblast growth hormone and plays a role in bone development and maintenance. Mutations in this gene lead to craniosynostosis and multiple types of skeletal dysplasia. Three alternatively spliced transcript variants that encode different protein isoforms have been described. | ENSG00000068078 | fibroblast growth factor receptor 3 | NA |
| 2355 | FOSL2 | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. | ENSG00000075426 | FOS like 2, AP-1 transcription factor subunit | NA |
| 23089 | PEG10 | This is a paternally expressed imprinted gene that is thought to have been derived from the Ty3/Gypsy family of retrotransposons. It contains two overlapping open reading frames, RF1 and RF2, and expresses two proteins: a shorter, gag-like protein (with a CCHC-type zinc finger domain) from RF1; and a longer, gag/pol-like fusion protein (with an additional aspartic protease motif) from RF1/RF2 by -1 translational frameshifting (-1 FS). While -1 FS has been observed in RNA viruses and transposons in both prokaryotes and eukaryotes, this gene represents the first example of -1 FS in a eukaryotic cellular gene. This gene is highly conserved across mammalian species and retains the heptanucleotide (GGGAAAC) and pseudoknot elements required for -1 FS. It is expressed in adult and embryonic tissues (most notably in placenta) and reported to have a role in cell proliferation, differentiation and apoptosis. Overexpression of this gene has been associated with several malignancies, such as hepatocellular carcinoma and B-cell lymphocytic leukemia. Knockout mice lacking this gene showed early embryonic lethality with placental defects, indicating the importance of this gene in embryonic development. Additional isoforms resulting from alternatively spliced transcript variants, and use of upstream non-AUG (CUG) start codon have been reported for this gene. | ENSG00000242265 | paternally expressed 10 | NA |
| 5339 | PLEC | Plectin is a prominent member of an important family of structurally and in part functionally related proteins, termed plakins or cytolinkers, that are capable of interlinking different elements of the cytoskeleton. Plakins, with their multi-domain structure and enormous size, not only play crucial roles in maintaining cell and tissue integrity and orchestrating dynamic changes in cytoarchitecture and cell shape, but also serve as scaffolding platforms for the assembly, positioning, and regulation of signaling complexes (reviewed in PMID: 9701547, 11854008, and 17499243). Plectin is expressed as several protein isoforms in a wide range of cell types and tissues from a single gene located on chromosome 8 in humans (PMID: 8633055, 8698233). Until 2010, this locus was named plectin 1 (symbol PLEC1 in human; Plec1 in mouse and rat) and the gene product had been referred to as ‘hemidesmosomal protein 1’ or ‘plectin 1, intermediate filament binding 500kDa’. These names were superseded by plectin. The plectin gene locus in mouse on chromosome 15 has been analyzed in detail (PMID: 10556294, 14559777), revealing a genomic exon-intron organization with well over 40 exons spanning over 62 kb and an unusual 5’ transcript complexity of plectin isoforms. Eleven exons (1-1j) have been identified that alternatively splice directly into a common exon 2 which is the first exon to encode plectin’s highly conserved actin binding domain (ABD). Three additional exons (-1, 0a, and 0) splice into an alternative first coding exon (1c), and two additional exons (2alpha and 3alpha) are optionally spliced within the exons encoding the actin binding domain (exons 2-8). Analysis of the human locus has identified eight of the eleven alternative 5’ exons found in mouse and rat (PMID: 14672974); exons 1i, 1j and 1h have not been confirmed in human. Furthermore, isoforms lacking the central rod domain encoded by exon 31 have been detected in mouse (PMID: 10556294), rat (PMID: 9177781), and human (PMID: 11441066, 10780662, 20052759). The short alternative amino-terminal sequences encoded by the different first exons direct the targeting of the various isoforms to distinct subcellular locations (PMID: 14559777). As the expression of specific plectin isoforms was found to be dependent on cell type (tissue) and stage of development (PMID: 10556294, 12542521, 17389230) it appears that each cell type (tissue) contains a unique set (proportion and composition) of plectin isoforms, as if custom-made for specific requirements of the particular cells. Concordantly, individual isoforms were found to carry out distinct and specific functions (PMID: 14559777, 12542521, 18541706). In 1996, a number of groups reported that patients suffering from epidermolysis bullosa simplex with muscular dystrophy (EBS-MD) lacked plectin expression in skin and muscle tissues due to defects in the plectin gene (PMID: 8698233, 8941634, 8636409, 8894687, 8696340). Two other subtypes of plectin-related EBS have been described: EBS-pyloric atresia (PA) and EBS-Ogna. For reviews of plectin-related diseases see PMID: 15810881, 19945614. Mutations in the plectin gene related to human diseases should be named based on the position in NM_000445 (variant 1, isoform 1c), unless the mutation is located within one of the other alternative first exons, in which case the position in the respective Reference Sequence should be used. | ENSG00000178209 | plectin | NA |
| 8407 | TAGLN2 | The protein encoded by this gene is similar to the protein transgelin, which is one of the earliest markers of differentiated smooth muscle. The specific function of this protein has not yet been determined, although it is thought to be a tumor suppressor. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000158710 | transgelin 2 | NA |
| 27129 | HSPB7 | NA | ENSG00000173641 | heat shock protein family B (small) member 7 | NA |
| 80740 | LY6G6C | LY6G6C belongs to a cluster of leukocyte antigen-6 (LY6) genes located in the major histocompatibility complex (MHC) class III region on chromosome 6. Members of the LY6 superfamily typically contain 70 to 80 amino acids, including 8 to 10 cysteines. Most LY6 proteins are attached to the cell surface by a glycosylphosphatidylinositol (GPI) anchor that is directly involved in signal transduction (Mallya et al., 2002 [PubMed 12079290]). | ENSG00000204421 | lymphocyte antigen 6 complex, locus G6C | NA |
| 1832 | DSP | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | ENSG00000096696 | desmoplakin | NA |
| 3853 | KRT6A | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000205420 | keratin 6A | NA |
| 729238 | SFTPA2 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | ENSG00000185303 | surfactant protein A2 | NA |
| 5187 | PER1 | This gene is a member of the Period family of genes and is expressed in a circadian pattern in the suprachiasmatic nucleus, the primary circadian pacemaker in the mammalian brain. Genes in this family encode components of the circadian rhythms of locomotor activity, metabolism, and behavior. This gene is upregulated by CLOCK/ARNTL heterodimers but then represses this upregulation in a feedback loop using PER/CRY heterodimers to interact with CLOCK/ARNTL. Polymorphisms in this gene may increase the risk of getting certain cancers. Alternative splicing has been observed in this gene; however, these variants have not been fully described. | ENSG00000179094 | period circadian clock 1 | NA |
| 5524 | PTPA | Protein phosphatase 2A is one of the four major Ser/Thr phosphatases and is implicated in the negative control of cell growth and division. Protein phosphatase 2A holoenzymes are heterotrimeric proteins composed of a structural subunit A, a catalytic subunit C, and a regulatory subunit B. The regulatory subunit is encoded by a diverse set of genes that have been grouped into the B/PR55, B'/PR61, and B''/PR72 families. These different regulatory subunits confer distinct enzymatic specificities and intracellular localizations to the holoenzyme. The product of this gene belongs to the B' family. This gene encodes a specific phosphotyrosyl phosphatase activator of the dimeric form of protein phosphatase 2A. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000119383 | protein phosphatase 2 phosphatase activator | NA |
| 8751 | ADAM15 | The protein encoded by this gene is a member of the ADAM (a disintegrin and metalloproteinase) protein family. ADAM family members are type I transmembrane glycoproteins known to be involved in cell adhesion and proteolytic ectodomain processing of cytokines and adhesion molecules. This protein contains multiple functional domains including a zinc-binding metalloprotease domain, a disintegrin-like domain, as well as an EGF-like domain. Through its disintegrin-like domain, this protein specifically interacts with the integrin beta chain, beta 3. It also interacts with Src family protein-tyrosine kinases in a phosphorylation-dependent manner, suggesting that this protein may function in cell-cell adhesion as well as in cellular signaling. Multiple alternatively spliced transcript variants encoding distinct isoforms have been observed. | ENSG00000143537 | ADAM metallopeptidase domain 15 | NA |
| 8848 | TSC22D1 | This gene encodes a member of the TSC22 domain family of leucine zipper transcription factors. The encoded protein is stimulated by transforming growth factor beta, and regulates the transcription of multiple genes including C-type natriuretic peptide. The encoded protein may play a critical role in tumor suppression through the induction of cancer cell apoptosis, and a single nucleotide polymorphism in the promoter of this gene has been associated with diabetic nephropathy. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000102804 | TSC22 domain family member 1 | NA |
write.table(out$query, "../utilities/GTEX2013_sparse_load_sqrt/gene_names_clus_20.txt",
            col.names = FALSE, row.names = FALSE, quote = FALSE);
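The write step above saves every queried ID, including any that the annotation lookup could not resolve. A minimal sketch of filtering those out first is given here, assuming the result has a `query` column and, only when some lookups fail, a logical `notfound` column. The helper `resolved_queries` and the toy data frame are hypothetical illustrations, not part of the original analysis, and the output goes to a temporary file rather than the project path.

```r
# Hypothetical helper (not part of the original analysis): keep only the
# query IDs that the annotation lookup resolved.
resolved_queries <- function(annot) {
  annot <- as.data.frame(annot)
  # Assumption: the notfound column is absent when every query was resolved.
  if (!"notfound" %in% names(annot)) {
    return(as.character(annot$query))
  }
  # A query counts as unresolved only when notfound is explicitly TRUE.
  unresolved <- !is.na(annot$notfound) & annot$notfound
  as.character(annot$query[!unresolved])
}

# Toy stand-in for an annotation result; the middle row mimics a failed lookup.
toy <- data.frame(
  query    = c("ENSG00000205362", "ENSG00000179294", "ENSG00000136997"),
  symbol   = c("MT1A", NA, "MYC"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

ids <- resolved_queries(toy)
print(ids)

# Write one ID per line to a temporary file instead of the project directory.
out_file <- tempfile(fileext = ".txt")
write.table(ids, out_file, col.names = FALSE, row.names = FALSE, quote = FALSE)
```

Whether unresolved IDs should be dropped or kept depends on what consumes the written gene list downstream; the sketch only shows one way to separate them.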
lambda_out <- read.table("../sfa_outputs/GTEX2013/voom_gtex/voom_gtex_sfa_lambda.out");
f_out <- t(read.table("../sfa_outputs/GTEX2013/voom_gtex/voom_gtex_sfa_F.out"));
gene_names <- as.vector(as.matrix(read.table("../sfa_inputs/gene_names_GTEX_V6.txt")));
gene_names <- substring(gene_names,1,15);
xli <- gene_names;
indices_mat <- SFA.ExtractTopFeatures(f_out, top_features = 100, options="min", mult.annotate = TRUE)
gene_list <- do.call(rbind, lapply(seq_len(nrow(indices_mat)), function(x) gene_names[indices_mat[x,]]))
out <- mygene::queryMany(gene_list[1,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | X_id | name | symbol | summary | notfound |
|---|---|---|---|---|---|
| ENSG00000205362 | 4489 | metallothionein 1A | MT1A | NA | NA |
| ENSG00000116285 | 54206 | ERBB receptor feedback inhibitor 1 | ERRFI1 | ERRFI1 is a cytoplasmic protein whose expression is upregulated with cell growth (Wick et al., 1995 [PubMed 7641805]). It shares significant homology with the protein product of rat gene-33, which is induced during cell stress and mediates cell signaling (Makkinje et al., 2000 [PubMed 10749885]; Fiorentino et al., 2000 [PubMed 11003669]). | NA |
| ENSG00000136997 | 4609 | v-myc avian myelocytomatosis viral oncogene homolog | MYC | The protein encoded by this gene is a multifunctional, nuclear phosphoprotein that plays a role in cell cycle progression, apoptosis and cellular transformation. It functions as a transcription factor that regulates transcription of specific target genes. Mutations, overexpression, rearrangement and translocation of this gene have been associated with a variety of hematopoietic tumors, leukemias and lymphomas, including Burkitt lymphoma. There is evidence to show that alternative translation initiations from an upstream, in-frame non-AUG (CUG) and a downstream AUG start site result in the production of two isoforms with distinct N-termini. The synthesis of non-AUG initiated protein is suppressed in Burkitt’s lymphomas, suggesting its importance in the normal function of this gene. | NA |
| ENSG00000081041 | 2920 | C-X-C motif chemokine ligand 2 | CXCL2 | This antimicrobial gene is part of a chemokine superfamily that encodes secreted proteins involved in immunoregulatory and inflammatory processes. The superfamily is divided into four subfamilies based on the arrangement of the N-terminal cysteine residues of the mature peptide. This chemokine, a member of the CXC subfamily, is expressed at sites of inflammation and may suppress hematopoietic progenitor cell proliferation. | NA |
| ENSG00000179751 | 342898 | syncollin | SYCN | NA | NA |
| ENSG00000145506 | 85409 | naked cuticle homolog 2 | NKD2 | This gene encodes a member of a family of proteins that function as negative regulators of Wnt receptor signaling through interaction with Dishevelled family members. The encoded protein participates in the delivery of transforming growth factor alpha-containing vesicles to the cell membrane. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| ENSG00000125740 | 2354 | FosB proto-oncogene, AP-1 transcription factor subunit | FOSB | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000175592 | 8061 | FOS like 1, AP-1 transcription factor subunit | FOSL1 | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000117143 | 6675 | UDP-N-acetylglucosamine pyrophosphorylase 1 | UAP1 | NA | NA |
| ENSG00000155090 | 7071 | Kruppel like factor 10 | KLF10 | This gene encodes a member of a family of proteins that feature C2H2-type zinc finger domains. The encoded protein is a transcriptional repressor that acts as an effector of transforming growth factor beta signaling. Activity of this protein may inhibit the growth of cancers, particularly pancreatic cancer. Alternative splicing results in multiple transcript variants. | NA |
| ENSG00000179294 | NA | NA | NA | NA | TRUE |
| ENSG00000113739 | 8614 | stanniocalcin 2 | STC2 | This gene encodes a secreted, homodimeric glycoprotein that is expressed in a wide variety of tissues and may have autocrine or paracrine functions. The encoded protein has 10 of its 15 cysteine residues conserved among stanniocalcin family members and is phosphorylated by casein kinase 2 exclusively on its serine residues. Its C-terminus contains a cluster of histidine residues which may interact with metal ions. The protein may play a role in the regulation of renal and intestinal calcium and phosphate transport, cell metabolism, or cellular calcium/phosphate homeostasis. Constitutive overexpression of human stanniocalcin 2 in mice resulted in pre- and postnatal growth restriction, reduced bone and skeletal muscle growth, and organomegaly. Expression of this gene is induced by estrogen and altered in some breast cancers. | NA |
| ENSG00000119508 | 8013 | nuclear receptor subfamily 4 group A member 3 | NR4A3 | This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. The encoded protein may act as a transcriptional activator. The protein can efficiently bind the NGFI-B Response Element (NBRE). Three different versions of extraskeletal myxoid chondrosarcomas (EMCs) are the result of reciprocal translocations between this gene and other genes. The translocation breakpoints are associated with Nuclear Receptor Subfamily 4, Group A, Member 3 (on chromosome 9) and either Ewing Sarcoma Breakpoint Region 1 (on chromosome 22), RNA Polymerase II, TATA Box-Binding Protein-Associated Factor, 68-KD (on chromosome 17), or Transcription factor 12 (on chromosome 15). Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000124145 | 6385 | syndecan 4 | SDC4 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan that functions as a receptor in intracellular signaling. The encoded protein is found as a homodimer and is a member of the syndecan proteoglycan family. This gene is found on chromosome 20, while a pseudogene has been found on chromosome 22. | NA |
| ENSG00000158516 | 1358 | carboxypeptidase A2 | CPA2 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | NA |
| ENSG00000020577 | 23034 | sterile alpha motif domain containing 4A | SAMD4A | Sterile alpha motifs (SAMs) in proteins such as SAMD4A are part of an RNA-binding domain that functions as a posttranscriptional regulator by binding to an RNA sequence motif known as the Smaug recognition element, which was named after the Drosophila Smaug protein (Baez and Boccaccio, 2005 [PubMed 16221671]). | NA |
| ENSG00000171621 | 80176 | splA/ryanodine receptor domain and SOCS box containing 1 | SPSB1 | NA | NA |
| ENSG00000164761 | 4982 | tumor necrosis factor receptor superfamily member 11b | TNFRSF11B | The protein encoded by this gene is a member of the TNF-receptor superfamily. This protein is an osteoblast-secreted decoy receptor that functions as a negative regulator of bone resorption. This protein specifically binds to its ligand, osteoprotegerin ligand, both of which are key extracellular regulators of osteoclast development. Studies of the mouse counterpart also suggest that this protein and its ligand play a role in lymph-node organogenesis and vascular calcification. Alternatively spliced transcript variants of this gene have been reported, but their full length nature has not been determined. | NA |
| ENSG00000163874 | 80149 | zinc finger CCCH-type containing 12A | ZC3H12A | ZC3H12A is an MCP1 (CCL2; MIM 158105)-induced protein that acts as a transcriptional activator and causes cell death of cardiomyocytes, possibly via induction of genes associated with apoptosis. | NA |
| ENSG00000103569 | 366 | aquaporin 9 | AQP9 | The aquaporins are a family of water-selective membrane channels. This gene encodes a member of a subset of aquaporins called the aquaglyceroporins. This protein allows passage of a broad range of noncharged solutes and also stimulates urea transport and osmotic water permeability. This protein may also facilitate the uptake of glycerol in hepatic tissue. The encoded protein may also play a role in specialized leukocyte functions such as immunological response and bactericidal activity. Alternate splicing results in multiple transcript variants. | NA |
| ENSG00000188522 | 644815 | family with sequence similarity 83 member G | FAM83G | NA | NA |
| ENSG00000144031 | 79998 | ankyrin repeat domain 53 | ANKRD53 | NA | NA |
| ENSG00000165732 | 9188 | DEAD-box helicase 21 | DDX21 | DEAD box proteins, characterized by the conserved motif Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein, which is an antigen recognized by autoimmune antibodies from a patient with watermelon stomach disease. This protein unwinds double-stranded RNA, folds single-stranded RNA, and may play important roles in ribosomal RNA biogenesis, RNA editing, RNA transport, and general transcription. | NA |
| ENSG00000060138 | 8531 | Y-box binding protein 3 | YBX3 | NA | NA |
| ENSG00000143196 | 1805 | dermatopontin | DPT | Dermatopontin is an extracellular matrix protein with possible functions in cell-matrix interactions and matrix assembly. The protein is found in various tissues and many of its tyrosine residues are sulphated. Dermatopontin is postulated to modify the behavior of TGF-beta through interaction with decorin. | NA |
| ENSG00000110104 | 79080 | coiled-coil domain containing 86 | CCDC86 | NA | NA |
| ENSG00000132329 | 10267 | receptor activity modifying protein 1 | RAMP1 | The protein encoded by this gene is a member of the RAMP family of single-transmembrane-domain proteins, called receptor (calcitonin) activity modifying proteins (RAMPs). RAMPs are type I transmembrane proteins with an extracellular N terminus and a cytoplasmic C terminus. RAMPs are required to transport calcitonin-receptor-like receptor (CRLR) to the plasma membrane. CRLR, a receptor with seven transmembrane domains, can function as either a calcitonin-gene-related peptide (CGRP) receptor or an adrenomedullin receptor, depending on which members of the RAMP family are expressed. In the presence of this (RAMP1) protein, CRLR functions as a CGRP receptor. The RAMP1 protein is involved in the terminal glycosylation, maturation, and presentation of the CGRP receptor to the cell surface. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| ENSG00000182253 | 23336 | synemin | SYNM | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | NA |
| ENSG00000112208 | 9532 | BCL2 associated athanogene 2 | BAG2 | BAG proteins compete with Hip for binding to the Hsc70/Hsp70 ATPase domain and promote substrate release. All the BAG proteins have an approximately 45-amino acid BAG domain near the C terminus but differ markedly in their N-terminal regions. The predicted BAG2 protein contains 211 amino acids. The BAG domains of BAG1, BAG2, and BAG3 interact specifically with the Hsc70 ATPase domain in vitro and in mammalian cells. All 3 proteins bind with high affinity to the ATPase domain of Hsc70 and inhibit its chaperone activity in a Hip-repressible manner. | NA |
| ENSG00000255959 | ENSG00000255959 | NA | RP11-804A23.2 | NA | NA |
| ENSG00000131462 | 7283 | tubulin gamma 1 | TUBG1 | This gene encodes a member of the tubulin superfamily. The encoded protein localizes to the centrosome where it binds to microtubules as part of a complex referred to as the gamma-tubulin ring complex. The protein mediates microtubule nucleation and is required for microtubule formation and progression of the cell cycle. A pseudogene of this gene is found on chromosome 7. | NA |
| ENSG00000173641 | 27129 | heat shock protein family B (small) member 7 | HSPB7 | NA | NA |
| ENSG00000090339 | 3383 | intercellular adhesion molecule 1 | ICAM1 | This gene encodes a cell surface glycoprotein which is typically expressed on endothelial cells and cells of the immune system. It binds to integrins of type CD11a / CD18, or CD11b / CD18 and is also exploited by Rhinovirus as a receptor. | NA |
| ENSG00000198848 | 1066 | carboxylesterase 1 | CES1 | This gene encodes a member of the carboxylesterase large family. The family members are responsible for the hydrolysis or transesterification of various xenobiotics, such as cocaine and heroin, and endogenous substrates with ester, thioester, or amide bonds. They may participate in fatty acyl and cholesterol ester metabolism, and may play a role in the blood-brain barrier system. This enzyme is the major liver enzyme and functions in liver drug clearance. Mutations of this gene cause carboxylesterase 1 deficiency. Three transcript variants encoding three different isoforms have been found for this gene. | NA |
| ENSG00000268603 | ENSG00000268603 | NA | RP11-316O14.1 | NA | NA |
| ENSG00000114200 | 590 | butyrylcholinesterase | BCHE | Mutant alleles at the BCHE locus are responsible for suxamethonium sensitivity. Homozygous persons sustain prolonged apnea after administration of the muscle relaxant suxamethonium in connection with surgical anesthesia. The activity of pseudocholinesterase in the serum is low and its substrate behavior is atypical. In the absence of the relaxant, the homozygote is at no known disadvantage. | NA |
| ENSG00000125148 | 4502 | metallothionein 2A | MT2A | NA | NA |
| ENSG00000205364 | 4499 | metallothionein 1M | MT1M | This gene encodes a member of the metallothionein superfamily, type 1 family. Metallothioneins have a high content of cysteine residues that bind various heavy metals. These genes are transcriptionally regulated by both heavy metals and glucocorticoids. | NA |
| ENSG00000137124 | 219 | aldehyde dehydrogenase 1 family member B1 | ALDH1B1 | This protein belongs to the aldehyde dehydrogenases family of proteins. Aldehyde dehydrogenase is the second enzyme of the major oxidative pathway of alcohol metabolism. This gene does not contain introns in the coding sequence. The variation of this locus may affect the development of alcohol-related problems. | NA |
| ENSG00000167034 | 4824 | NK3 homeobox 1 | NKX3-1 | This gene encodes a homeobox-containing transcription factor. This transcription factor functions as a negative regulator of epithelial cell growth in prostate tissue. Aberrant expression of this gene is associated with prostate tumor progression. Alternate splicing results in multiple transcript variants of this gene. | NA |
| ENSG00000257718 | ENSG00000257718 | NA | RP11-396F22.1 | NA | NA |
| ENSG00000108691 | 6347 | C-C motif chemokine ligand 2 | CCL2 | This gene is one of several cytokine genes clustered on the q-arm of chromosome 17. Chemokines are a superfamily of secreted proteins involved in immunoregulatory and inflammatory processes. The superfamily is divided into four subfamilies based on the arrangement of N-terminal cysteine residues of the mature peptide. This chemokine is a member of the CC subfamily which is characterized by two adjacent cysteine residues. This cytokine displays chemotactic activity for monocytes and basophils but not for neutrophils or eosinophils. It has been implicated in the pathogenesis of diseases characterized by monocytic infiltrates, like psoriasis, rheumatoid arthritis and atherosclerosis. It binds to chemokine receptors CCR2 and CCR4. | NA |
| ENSG00000074219 | 8463 | TEA domain transcription factor 2 | TEAD2 | NA | NA |
| ENSG00000101447 | 81610 | family with sequence similarity 83 member D | FAM83D | NA | NA |
| ENSG00000124762 | 1026 | cyclin-dependent kinase inhibitor 1A | CDKN1A | This gene encodes a potent cyclin-dependent kinase inhibitor. The encoded protein binds to and inhibits the activity of cyclin-cyclin-dependent kinase2 or -cyclin-dependent kinase4 complexes, and thus functions as a regulator of cell cycle progression at G1. The expression of this gene is tightly controlled by the tumor suppressor protein p53, through which this protein mediates the p53-dependent cell cycle G1 phase arrest in response to a variety of stress stimuli. This protein can interact with proliferating cell nuclear antigen, a DNA polymerase accessory factor, and plays a regulatory role in S phase DNA replication and DNA damage repair. This protein was reported to be specifically cleaved by CASP3-like caspases, which thus leads to a dramatic activation of cyclin-dependent kinase2, and may be instrumental in the execution of apoptosis following caspase activation. Mice that lack this gene have the ability to regenerate damaged or missing tissue. Multiple alternatively spliced variants have been found for this gene. | NA |
| ENSG00000168386 | 11259 | filamin A interacting protein 1 like | FILIP1L | NA | NA |
| ENSG00000118520 | 383 | arginase 1 | ARG1 | Arginase catalyzes the hydrolysis of arginine to ornithine and urea. At least two isoforms of mammalian arginase exist (types I and II) which differ in their tissue distribution, subcellular localization, immunologic crossreactivity and physiologic function. The type I isoform encoded by this gene, is a cytosolic enzyme and expressed predominantly in the liver as a component of the urea cycle. Inherited deficiency of this enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000102802 | 84935 | mesenteric estrogen dependent adipogenesis | MEDAG | NA | NA |
| ENSG00000166025 | 154810 | angiomotin like 1 | AMOTL1 | The protein encoded by this gene is a peripheral membrane protein that is a component of tight junctions or TJs. TJs form an apical junctional structure and act to control paracellular permeability and maintain cell polarity. This protein is related to angiomotin, an angiostatin binding protein that regulates endothelial cell migration and capillary formation. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000154548 | 135295 | serine and arginine rich splicing factor 12 | SRSF12 | NA | NA |
| ENSG00000254902 | ENSG00000254902 | ANO1 antisense RNA 1 | ANO1-AS1 | NA | NA |
| ENSG00000214456 | 440503 | perilipin 5 | PLIN5 | Members of the perilipin family, such as PLIN5, coat intracellular lipid storage droplets and protect them from lipolytic degradation (Dalen et al., 2007 [PubMed 17234449]). | NA |
| ENSG00000258554 | ENSG00000258554 | NA | RP11-973D8.4 | NA | NA |
| ENSG00000130164 | 3949 | low density lipoprotein receptor | LDLR | The low density lipoprotein receptor (LDLR) gene family consists of cell surface proteins involved in receptor-mediated endocytosis of specific ligands. Low density lipoprotein (LDL) is normally bound at the cell membrane and taken into the cell ending up in lysosomes where the protein is degraded and the cholesterol is made available for repression of microsomal enzyme 3-hydroxy-3-methylglutaryl coenzyme A (HMG CoA) reductase, the rate-limiting step in cholesterol synthesis. At the same time, a reciprocal stimulation of cholesterol ester synthesis takes place. Mutations in this gene cause the autosomal dominant disorder, familial hypercholesterolemia. Alternate splicing results in multiple transcript variants. | NA |
| ENSG00000138709 | 55132 | La ribonucleoprotein domain family member 1B | LARP1B | This gene encodes a protein containing domains found in the La related protein of Drosophila melanogaster. La motif-containing proteins are thought to be RNA-binding proteins, where the La motif and adjacent amino acids fold into an RNA recognition motif. The La motif is also found in proteins unrelated to the La protein. Alternative splicing has been observed at this locus and multiple variants, encoding distinct isoforms, are described. Additional splice variation has been identified but the full-length nature of these transcripts has not been determined. | NA |
| ENSG00000183655 | 64410 | kelch like family member 25 | KLHL25 | NA | NA |
| ENSG00000272275 | ENSG00000272275 | NA | RP11-791G15.2 | NA | NA |
| ENSG00000134470 | 3601 | interleukin 15 receptor subunit alpha | IL15RA | This gene encodes a cytokine receptor that specifically binds interleukin 15 (IL15) with high affinity. The receptors of IL15 and IL2 share two subunits, IL2R beta and IL2R gamma. This forms the basis of many overlapping biological activities of IL15 and IL2. The protein encoded by this gene is structurally related to IL2R alpha, an additional IL2-specific alpha subunit necessary for high affinity IL2 binding. Unlike IL2RA, IL15RA is capable of binding IL15 with high affinity independent of other subunits, which suggests distinct roles between IL15 and IL2. This receptor is reported to enhance cell proliferation and expression of apoptosis inhibitor BCL2L1/BCL2-XL and BCL2. Multiple alternatively spliced transcript variants of this gene have been reported. | NA |
| ENSG00000178814 | 26873 | 5-oxoprolinase (ATP-hydrolysing) | OPLAH | The protein encoded by this gene acts as a homodimer, using ATP hydrolysis to catalyze the conversion of 5-oxo-L-proline to L-glutamate. Defects in this gene are a cause of 5-oxoprolinase deficiency (OPLAHD). | NA |
| ENSG00000232815 | ENSG00000232815 | double homeobox 4 like 50, pseudogene | DUX4L50 | NA | NA |
| ENSG00000135931 | 80210 | armadillo repeat containing 9 | ARMC9 | NA | NA |
| ENSG00000250899 | ENSG00000250899 | NA | RP11-253E3.3 | NA | NA |
| ENSG00000261616 | ENSG00000261616 | NA | RP11-6O2.3 | NA | NA |
| ENSG00000113916 | 604 | B-cell CLL/lymphoma 6 | BCL6 | The protein encoded by this gene is a zinc finger transcription factor and contains an N-terminal POZ domain. This protein acts as a sequence-specific repressor of transcription, and has been shown to modulate the transcription of STAT-dependent IL-4 responses of B cells. This protein can interact with a variety of POZ-containing proteins that function as transcription corepressors. This gene is found to be frequently translocated and hypermutated in diffuse large-cell lymphoma (DLCL), and may be involved in the pathogenesis of DLCL. Alternatively spliced transcript variants encoding different protein isoforms have been found for this gene. | NA |
| ENSG00000259799 | ENSG00000259799 | NA | RP11-554A11.9 | NA | NA |
| ENSG00000148840 | 23082 | peroxisome proliferator-activated receptor gamma, coactivator-related 1 | PPRC1 | The protein encoded by this gene is similar to PPAR-gamma coactivator 1 (PPARGC1/PGC-1), a protein that can activate mitochondrial biogenesis in part through a direct interaction with nuclear respiratory factor 1 (NRF1). This protein has been shown to interact with NRF1. It is thought to be a functional relative of PPAR-gamma coactivator 1 that activates mitochondrial biogenesis through NRF1 in response to proliferative signals. Alternative splicing results in multiple transcript variants. | NA |
| ENSG00000169583 | 9022 | chloride intracellular channel 3 | CLIC3 | Chloride channels are a diverse group of proteins that regulate fundamental cellular processes including stabilization of cell membrane potential, transepithelial transport, maintenance of intracellular pH, and regulation of cell volume. Chloride intracellular channel 3 is a member of the p64 family and is predominantly localized in the nucleus and stimulates chloride ion channel activity. In addition, this protein may participate in cellular growth control, based on its association with ERK7, a member of the MAP kinase family. | NA |
| ENSG00000267607 | ENSG00000267607 | NA | CTD-2369P2.8 | NA | NA |
| ENSG00000175505 | 23529 | cardiotrophin-like cytokine factor 1 | CLCF1 | This gene is a member of the glycoprotein (gp)130 cytokine family and encodes cardiotrophin-like cytokine factor 1 (CLCF1). CLCF1 forms a heterodimer complex with cytokine receptor-like factor 1 (CRLF1). This dimer competes with ciliary neurotrophic factor (CNTF) for binding to the ciliary neurotrophic factor receptor (CNTFR) complex, and activates the Jak-STAT signaling cascade. CLCF1 can be actively secreted from cells by forming a complex with soluble type I CRLF1 or soluble CNTFR. CLCF1 is a potent neurotrophic factor, B-cell stimulatory agent and neuroendocrine modulator of pituitary corticotroph function. Defects in CLCF1 cause cold-induced sweating syndrome 2 (CISS2). This syndrome is characterized by a profuse sweating after exposure to cold as well as congenital physical abnormalities of the head and spine. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | NA |
| ENSG00000181026 | 64782 | apoptosis enhancing nuclease | AEN | NA | NA |
| ENSG00000159388 | 7832 | BTG family member 2 | BTG2 | The protein encoded by this gene is a member of the BTG/Tob family. This family has structurally related proteins that appear to have antiproliferative properties. This encoded protein is involved in the regulation of the G1/S transition of the cell cycle. | NA |
| ENSG00000187193 | 4501 | metallothionein 1X | MT1X | NA | NA |
| ENSG00000162878 | 91461 | protein kinase domain containing, cytoplasmic | PKDCC | NA | NA |
| ENSG00000259827 | ENSG00000259827 | NA | RP11-343H19.2 | NA | NA |
| ENSG00000065150 | 3843 | importin 5 | IPO5 | Nucleocytoplasmic transport, a signal- and energy-dependent process, takes place through nuclear pore complexes embedded in the nuclear envelope. The import of proteins containing a nuclear localization signal (NLS) requires the NLS import receptor, a heterodimer of importin alpha and beta subunits also known as karyopherins. Importin alpha binds the NLS-containing cargo in the cytoplasm and importin beta docks the complex at the cytoplasmic side of the nuclear pore complex. In the presence of nucleoside triphosphates and the small GTP binding protein Ran, the complex moves into the nuclear pore complex and the importin subunits dissociate. Importin alpha enters the nucleoplasm with its passenger protein and importin beta remains at the pore. Interactions between importin beta and the FG repeats of nucleoporins are essential in translocation through the pore complex. The protein encoded by this gene is a member of the importin beta family. | NA |
| ENSG00000123374 | 1017 | cyclin-dependent kinase 2 | CDK2 | This gene encodes a member of a family of serine/threonine protein kinases that participate in cell cycle regulation. The encoded protein is the catalytic subunit of the cyclin-dependent protein kinase complex, which regulates progression through the cell cycle. Activity of this protein is especially critical during the G1 to S phase transition. This protein associates with and is regulated by other subunits of the complex including cyclin A or E, CDK inhibitor p21Cip1 (CDKN1A), and p27Kip1 (CDKN1B). Alternative splicing results in multiple transcript variants. | NA |
| ENSG00000170890 | 5319 | phospholipase A2 group IB | PLA2G1B | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | NA |
| ENSG00000165995 | 783 | calcium voltage-gated channel auxiliary subunit beta 2 | CACNB2 | This gene encodes a subunit of a voltage-dependent calcium channel protein that is a member of the voltage-gated calcium channel superfamily. The gene product was originally identified as an antigen target in Lambert-Eaton myasthenic syndrome, an autoimmune disorder. Mutations in this gene are associated with Brugada syndrome. Alternatively spliced variants encoding different isoforms have been described. | NA |
| ENSG00000250471 | ENSG00000250471 | guanine monophosphate synthase pseudogene 1 | GMPSP1 | NA | NA |
| ENSG00000090621 | 8761 | poly(A) binding protein cytoplasmic 4 | PABPC4 | Poly(A)-binding proteins (PABPs) bind to the poly(A) tail present at the 3-prime ends of most eukaryotic mRNAs. PABPC4 or IPABP (inducible PABP) was isolated as an activation-induced T-cell mRNA encoding a protein. Activation of T cells increased PABPC4 mRNA levels in T cells approximately 5-fold. PABPC4 contains 4 RNA-binding domains and a proline-rich C terminus. PABPC4 is localized primarily to the cytoplasm. It is suggested that PABPC4 might be necessary for regulation of stability of labile mRNA species in activated T cells. PABPC4 was also identified as an antigen, APP1 (activated-platelet protein-1), expressed on thrombin-activated rabbit platelets. PABPC4 may also be involved in the regulation of protein translation in platelets and megakaryocytes or may participate in the binding or stabilization of polyadenylates in platelet dense granules. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000164023 | 166929 | sphingomyelin synthase 2 | SGMS2 | Sphingomyelin, a major component of cell and Golgi membranes, is made by the transfer of phosphocholine from phosphatidylcholine onto ceramide, with diacylglycerol as a side product. The protein encoded by this gene is an enzyme that catalyzes this reaction primarily at the cell membrane. The synthesis is reversible, and this enzyme can catalyze the reaction in either direction. The encoded protein is required for cell growth. Three transcript variants encoding the same protein have been found for this gene. There is evidence for more variants, but the full-length nature of their transcripts has not been determined. | NA |
| ENSG00000101187 | 28231 | solute carrier organic anion transporter family member 4A1 | SLCO4A1 | NA | NA |
| ENSG00000178531 | 404217 | cortexin 1 | CTXN1 | NA | NA |
| ENSG00000109610 | 6649 | superoxide dismutase 3, extracellular | SOD3 | This gene encodes a member of the superoxide dismutase (SOD) protein family. SODs are antioxidant enzymes that catalyze the conversion of superoxide radicals into hydrogen peroxide and oxygen, which may protect the brain, lungs, and other tissues from oxidative stress. Proteolytic processing of the encoded protein results in the formation of two distinct homotetramers that differ in their ability to interact with the extracellular matrix (ECM). Homotetramers consisting of the intact protein, or type C subunit, exhibit high affinity for heparin and are anchored to the ECM. Homotetramers consisting of a proteolytically cleaved form of the protein, or type A subunit, exhibit low affinity for heparin and do not interact with the ECM. A mutation in this gene may be associated with increased heart disease risk. | NA |
| ENSG00000159208 | 148523 | circadian associated repressor of transcription | CIART | NA | NA |
| ENSG00000224376 | ENSG00000224376 | NA | AC017104.6 | NA | NA |
| ENSG00000122863 | 9469 | carbohydrate sulfotransferase 3 | CHST3 | This gene encodes an enzyme which catalyzes the sulfation of chondroitin, a proteoglycan found in the extracellular matrix and most cells which is involved in cell migration and differentiation. Mutations in this gene are associated with spondyloepiphyseal dysplasia and humerospinal dysostosis. | NA |
| ENSG00000181458 | 55076 | transmembrane protein 45A | TMEM45A | NA | NA |
| ENSG00000115604 | 8809 | interleukin 18 receptor 1 | IL18R1 | The protein encoded by this gene is a cytokine receptor that belongs to the interleukin 1 receptor family. This receptor specifically binds interleukin 18 (IL18), and is essential for IL18 mediated signal transduction. IFN-alpha and IL12 are reported to induce the expression of this receptor in NK and T cells. This gene along with four other members of the interleukin 1 receptor family, including IL1R2, IL1R1, IL1RL2 (IL-1Rrp2), and IL1RL1 (T1/ST2), form a gene cluster on chromosome 2q. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000176014 | 84617 | tubulin beta 6 class V | TUBB6 | NA | NA |
| ENSG00000137193 | 5292 | Pim-1 proto-oncogene, serine/threonine kinase | PIM1 | The protein encoded by this gene belongs to the Ser/Thr protein kinase family, and PIM subfamily. This gene is expressed primarily in B-lymphoid and myeloid cell lines, and is overexpressed in hematopoietic malignancies and in prostate cancer. It plays a role in signal transduction in blood cells, contributing to both cell proliferation and survival, and thus provides a selective advantage in tumorigenesis. Both the human and orthologous mouse genes have been reported to encode two isoforms (with preferential cellular localization) resulting from the use of alternative in-frame translation initiation codons, the upstream non-AUG (CUG) and downstream AUG codons (PMIDs:16186805, 1825810). | NA |
| ENSG00000117479 | 10560 | solute carrier family 19 member 2 | SLC19A2 | This gene encodes the thiamin transporter protein. Mutations in this gene cause thiamin-responsive megaloblastic anemia syndrome (TRMA), which is an autosomal recessive disorder characterized by diabetes mellitus, megaloblastic anemia and sensorineural deafness. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| ENSG00000237989 | 101928399 | uncharacterized LOC101928399 | TCONS_00029157 | NA | NA |
| ENSG00000141076 | 84916 | UTP4, small subunit processome component | UTP4 | This gene encodes a WD40-repeat-containing protein that is localized to the nucleolus. Mutation of this gene causes North American Indian childhood cirrhosis, a severe intrahepatic cholestasis that results in transient neonatal jaundice, and progresses to periportal fibrosis and cirrhosis in childhood and adolescence. Alternative splicing results in multiple transcript variants. | NA |
| ENSG00000110218 | 24145 | pannexin 1 | PANX1 | The protein encoded by this gene belongs to the innexin family. Innexin family members are the structural components of gap junctions. This protein and pannexin 2 are abundantly expressed in the central nervous system (CNS) and are coexpressed in various neuronal populations. Studies in Xenopus oocytes suggest that this protein alone and in combination with pannexin 2 may form cell type-specific gap junctions with distinct properties. | NA |
| ENSG00000115758 | 4953 | ornithine decarboxylase 1 | ODC1 | This gene encodes the rate-limiting enzyme of the polyamine biosynthesis pathway which catalyzes ornithine to putrescine. The activity level for the enzyme varies in response to growth-promoting stimuli and exhibits a high turnover rate in comparison to other mammalian proteins. Originally localized to both chromosomes 2 and 7, the gene encoding this enzyme has been determined to be located on 2p25, with a pseudogene located on 7q31-qter. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified. | NA |
| ENSG00000173530 | 8793 | tumor necrosis factor receptor superfamily member 10d | TNFRSF10D | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor contains an extracellular TRAIL-binding domain, a transmembrane domain, and a truncated cytoplasmic death domain. This receptor does not induce apoptosis, and has been shown to play an inhibitory role in TRAIL-induced cell apoptosis. | NA |
| ENSG00000204291 | 1306 | collagen type XV alpha 1 chain | COL15A1 | This gene encodes the alpha chain of type XV collagen, a member of the FACIT collagen family (fibril-associated collagens with interrupted helices). Type XV collagen has a wide tissue distribution but the strongest expression is localized to basement membrane zones so it may function to adhere basement membranes to underlying connective tissue stroma. The proteolytically produced C-terminal fragment of type XV collagen is restin, a potentially antiangiogenic protein that is closely related to endostatin. Mouse studies have shown that collagen XV deficiency is associated with muscle and microvessel deterioration. | NA |
| ENSG00000157110 | 11030 | RNA binding protein with multiple splicing | RBPMS | This gene encodes a member of the RNA recognition motif family of RNA-binding proteins. The RNA recognition motif is between 80-100 amino acids in length and family members contain one to four copies of the motif. The RNA recognition motif consists of two short stretches of conserved sequence, as well as a few highly conserved hydrophobic residues. The encoded protein has a single, putative RNA recognition motif in its N-terminus. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| ENSG00000112245 | 7803 | protein tyrosine phosphatase type IVA, member 1 | PTP4A1 | This gene encodes a member of a small class of prenylated protein tyrosine phosphatases (PTPs), which contain a PTP domain and a characteristic C-terminal prenylation motif. The encoded protein is a cell signaling molecule that plays regulatory roles in a variety of cellular processes, including cell proliferation and migration. The protein may also be involved in cancer development and metastasis. This tyrosine phosphatase is a nuclear protein, but may associate with plasma membrane by means of its prenylation motif. Pseudogenes related to this gene are located on chromosomes 1, 2, 5, 7, 11 and X. | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",1,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
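
The `write.table` call above saves every queried ID, including any that mygene could not resolve (rows flagged `TRUE` in the `notfound` column). If a downstream step expects only annotated genes, one option is to filter before writing. The following is a minimal sketch under that assumption: `write_cluster_genes` is a hypothetical helper, not part of the original analysis, and the toy data frame only mimics the columns of a `queryMany()` result (the repeated ID in its last row is added purely to illustrate de-duplication).

```r
# Hypothetical helper (an assumption, not the original pipeline): keep only
# the query IDs that were resolved, drop duplicates, and write one per line.
write_cluster_genes <- function(out, cluster_id, out_dir) {
  df <- as.data.frame(out)
  # The `notfound` column is only present when at least one query failed.
  if ("notfound" %in% names(df)) {
    df <- df[!(df$notfound %in% TRUE), , drop = FALSE]
  }
  ids <- unique(as.character(df$query))
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster_id, ".txt"))
  writeLines(ids, path)
  invisible(path)
}

# Toy input with the same column names as the annotation tables.
toy <- data.frame(
  query    = c("ENSG00000137392", "ENSG00000250606",
               "ENSG00000142615", "ENSG00000137392"),
  symbol   = c("CLPS", NA, "CELA2A", "CLPS"),
  notfound = c(NA, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

# Write to a temporary directory so the sketch does not touch project files.
path <- write_cluster_genes(toy, cluster_id = 2, out_dir = tempdir())
kept <- readLines(path)
```

Passing the cluster index as an argument also avoids hard-coding `1`, `2`, and so on in each repeated call; whether unresolved IDs should be dropped at all depends on what the files under `../utilities/` are later used for.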
out <- mygene::queryMany(gene_list[2,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | query | name | summary | notfound |
|---|---|---|---|---|---|
| CLPS | 1208 | ENSG00000137392 | colipase | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| CELA2A | 63036 | ENSG00000142615 | chymotrypsin like elastase family member 2A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2A is secreted from the pancreas as a zymogen. In other species, elastase 2A has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| REG1B | 5968 | ENSG00000172023 | regenerating family member 1 beta | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| CTRC | 11330 | ENSG00000162438 | chymotrypsin C | This gene encodes a member of the peptidase S1 family. The encoded protein is a serum calcium-decreasing factor that has chymotrypsin-like protease activity. Alternatively spliced transcript variants have been observed, but their full-length nature has not been determined. | NA |
| SYCN | 342898 | ENSG00000179751 | syncollin | NA | NA |
| CTRB2 | 440387 | ENSG00000168928 | chymotrypsinogen B2 | NA | NA |
| PNLIPRP1 | 5407 | ENSG00000187021 | pancreatic lipase related protein 1 | NA | NA |
| CELA3B | 23436 | ENSG00000219073 | chymotrypsin like elastase family member 3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | NA |
| PNLIP | 5406 | ENSG00000175535 | pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | NA |
| NA | NA | ENSG00000250606 | NA | NA | TRUE |
| RP11-331F4.4 | ENSG00000240338 | ENSG00000240338 | NA | NA | NA |
| CELA3A | 10136 | ENSG00000142789 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | NA |
| NA | NA | ENSG00000165862 | NA | NA | TRUE |
| CPB1 | 1360 | ENSG00000153002 | carboxypeptidase B1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | NA |
| CTRB1 | 1504 | ENSG00000168925 | chymotrypsinogen B1 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | NA |
| REG3A | 5068 | ENSG00000172016 | regenerating family member 3 alpha | This gene encodes a pancreatic secretory protein that may be involved in cell proliferation or differentiation. It has similarity to the C-type lectin superfamily. The enhanced expression of this gene is observed during pancreatic inflammation and liver carcinogenesis. The mature protein also functions as an antimicrobial protein with antibacterial activity. Alternate splicing results in multiple transcript variants that encode the same protein. | NA |
| AMY2A | 279 | ENSG00000243480 | amylase, alpha 2A (pancreatic) | This gene encodes a member of the alpha-amylase family of proteins. Amylases are secreted proteins that hydrolyze 1,4-alpha-glucoside bonds in oligosaccharides and polysaccharides, catalyzing the first step in digestion of dietary starch and glycogen. This gene and several family members are present in a gene cluster on chromosome 1. This gene encodes an amylase isoenzyme produced by the pancreas. | NA |
| CPA1 | 1357 | ENSG00000091704 | carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | NA |
| PRSS1 | 5644 | ENSG00000204983 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | NA |
| CELA2B | 51032 | ENSG00000215704 | chymotrypsin like elastase family member 2B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2B is secreted from the pancreas as a zymogen. In other species, elastase 2B has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| GP2 | 2813 | ENSG00000169347 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | NA |
| PLA2G1B | 5319 | ENSG00000170890 | phospholipase A2 group IB | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | NA |
| AMBP | 259 | ENSG00000106927 | alpha-1-microglobulin/bikunin precursor | This gene encodes a complex glycoprotein secreted in plasma. The precursor is proteolytically processed into distinct functioning proteins: alpha-1-microglobulin, which belongs to the superfamily of lipocalin transport proteins and may play a role in the regulation of inflammatory processes, and bikunin, which is a urinary trypsin inhibitor belonging to the superfamily of Kunitz-type protease inhibitors and plays an important role in many physiological and pathological processes. This gene is located on chromosome 9 in a cluster of lipocalin genes. | NA |
| CD44 | 960 | ENSG00000026508 | CD44 molecule (Indian blood group) | The protein encoded by this gene is a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. It is a receptor for hyaluronic acid (HA) and can also interact with other ligands, such as osteopontin, collagens, and matrix metalloproteinases (MMPs). This protein participates in a wide variety of cellular functions including lymphocyte activation, recirculation and homing, hematopoiesis, and tumor metastasis. Transcripts for this gene undergo complex alternative splicing that results in many functionally distinct isoforms, however, the full length nature of some of these variants has not been determined. Alternative splicing is the basis for the structural and functional diversity of this protein, and may be related to tumor metastasis. | NA |
| MT1G | 4495 | ENSG00000125144 | metallothionein 1G | NA | NA |
| ADIRF-AS1 | ENSG00000272734 | ENSG00000272734 | ADIRF antisense RNA 1 | NA | NA |
| GDF15 | 9518 | ENSG00000130513 | growth differentiation factor 15 | The protein encoded by this gene belongs to the transforming growth factor-beta (TGF-beta) family. The protein is expressed in a broad range of cell types, acts as a pleiotropic cytokine and is involved in the stress response program of cells after cellular injury. Increased protein levels are associated with disease states such as tissue hypoxia, inflammation, acute injury and oxidative stress. | NA |
| CPA2 | 1358 | ENSG00000158516 | carboxypeptidase A2 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | NA |
| RP1-68D18.4 | ENSG00000255443 | ENSG00000255443 | NA | NA | NA |
| ARHGEF28 | 64283 | ENSG00000214944 | Rho guanine nucleotide exchange factor 28 | This gene encodes a member of the Rho guanine nucleotide exchange factor family. The encoded protein interacts with low molecular weight neurofilament mRNA and may be involved in the formation of amyotrophic lateral sclerosis neurofilament aggregates. Alternate splicing results in multiple transcript variants. | NA |
| TMEM52 | 339456 | ENSG00000178821 | transmembrane protein 52 | NA | NA |
| RP11-862L9.3 | ENSG00000266844 | ENSG00000266844 | NA | NA | NA |
| FAM174B | 400451 | ENSG00000185442 | family with sequence similarity 174 member B | NA | NA |
| ALB | 213 | ENSG00000163631 | albumin | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | NA |
| GLIS3 | 169792 | ENSG00000107249 | GLIS family zinc finger 3 | This gene is a member of the GLI-similar zinc finger protein family and encodes a nuclear protein with five C2H2-type zinc finger domains. This protein functions as both a repressor and activator of transcription and is specifically involved in the development of pancreatic beta cells, the thyroid, eye, liver and kidney. Mutations in this gene have been associated with neonatal diabetes and congenital hypothyroidism (NDH). Alternatively spliced variants that encode different protein isoforms have been described but the full-length nature of only two have been determined. | NA |
| DLK1 | 8788 | ENSG00000185559 | delta like non-canonical Notch ligand 1 | This gene encodes a transmembrane protein that contains multiple epidermal growth factor repeats and functions as a regulator of cell growth. The encoded protein is involved in the differentiation of several cell types including adipocytes. This gene is located in a region of chromosome 14 frequently showing uniparental disomy, and is imprinted and expressed from the paternal allele. A single nucleotide variant in this gene is associated with child and adolescent obesity and shows polar overdominance, where heterozygotes carrying an active paternal allele express the phenotype, while mutant homozygotes are normal. | NA |
| PDGFD | 80310 | ENSG00000170962 | platelet derived growth factor D | The protein encoded by this gene is a member of the platelet-derived growth factor family. The four members of this family are mitogenic factors for cells of mesenchymal origin and are characterized by a core motif of eight cysteines, seven of which are found in this factor. This gene product only forms homodimers and, therefore, does not dimerize with the other three family members. It differs from alpha and beta members of this family in having an unusual N-terminal domain, the CUB domain. Two splice variants have been identified for this gene. | NA |
| XBP1 | 7494 | ENSG00000100219 | X-box binding protein 1 | This gene encodes a transcription factor that regulates MHC class II genes by binding to a promoter element referred to as an X box. This gene product is a bZIP protein, which was also identified as a cellular transcription factor that binds to an enhancer in the promoter of the T cell leukemia virus type 1 promoter. It may increase expression of viral proteins by acting as the DNA binding partner of a viral transactivator. It has been found that upon accumulation of unfolded proteins in the endoplasmic reticulum (ER), the mRNA of this gene is processed to an active form by an unconventional splicing mechanism that is mediated by the endonuclease inositol-requiring enzyme 1 (IRE1). The resulting loss of 26 nt from the spliced mRNA causes a frame-shift and an isoform XBP1(S), which is the functionally active transcription factor. The isoform encoded by the unspliced mRNA, XBP1(U), is constitutively expressed, and thought to function as a negative feedback regulator of XBP1(S), which shuts off transcription of target genes during the recovery phase of ER stress. A pseudogene of XBP1 has been identified and localized to chromosome 5. | NA |
| LOC100506314 | 100506314 | ENSG00000247498 | uncharacterized LOC100506314 | NA | NA |
| NUPR1 | 26471 | ENSG00000176046 | nuclear protein 1, transcriptional regulator | NA | NA |
| ARG2 | 384 | ENSG00000081181 | arginase 2 | Arginase catalyzes the hydrolysis of arginine to ornithine and urea. At least two isoforms of mammalian arginase exist (types I and II) which differ in their tissue distribution, subcellular localization, immunologic crossreactivity and physiologic function. The type II isoform encoded by this gene is located in the mitochondria and expressed in extra-hepatic tissues, especially kidney. The physiologic role of this isoform is poorly understood; it is thought to play a role in nitric oxide and polyamine metabolism. Transcript variants of the type II gene resulting from the use of alternative polyadenylation sites have been described. | NA |
| SNHG25 | ENSG00000266402 | ENSG00000266402 | small nucleolar RNA host gene 25 | NA | NA |
| CYP27B1 | 1594 | ENSG00000111012 | cytochrome P450 family 27 subfamily B member 1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The protein encoded by this gene localizes to the inner mitochondrial membrane where it hydroxylates 25-hydroxyvitamin D3 at the 1alpha position. This reaction synthesizes 1alpha,25-dihydroxyvitamin D3, the active form of vitamin D3, which binds to the vitamin D receptor and regulates calcium metabolism. Thus this enzyme regulates the level of biologically active vitamin D and plays an important role in calcium homeostasis. Mutations in this gene can result in vitamin D-dependent rickets type I. | NA |
| EIF4EBP1 | 1978 | ENSG00000187840 | eukaryotic translation initiation factor 4E binding protein 1 | This gene encodes one member of a family of translation repressor proteins. The protein directly interacts with eukaryotic translation initiation factor 4E (eIF4E), which is a limiting component of the multisubunit complex that recruits 40S ribosomal subunits to the 5’ end of mRNAs. Interaction of this protein with eIF4E inhibits complex assembly and represses translation. This protein is phosphorylated in response to various signals including UV irradiation and insulin signaling, resulting in its dissociation from eIF4E and activation of mRNA translation. | NA |
| TTR | 7276 | ENSG00000118271 | transthyretin | This gene encodes transthyretin, one of the three prealbumins including alpha-1-antitrypsin, transthyretin and orosomucoid. Transthyretin is a carrier protein; it transports thyroid hormones in the plasma and cerebrospinal fluid, and also transports retinol (vitamin A) in the plasma. The protein consists of a tetramer of identical subunits. More than 80 different mutations in this gene have been reported; most mutations are related to amyloid deposition, affecting predominantly peripheral nerve and/or the heart, and a small portion of the gene mutations is non-amyloidogenic. The diseases caused by mutations include amyloidotic polyneuropathy, euthyroid hyperthyroxinaemia, amyloidotic vitreous opacities, cardiomyopathy, oculoleptomeningeal amyloidosis, meningocerebrovascular amyloidosis, carpal tunnel syndrome, etc. | NA |
| TMC4 | 147798 | ENSG00000167608 | transmembrane channel like 4 | NA | NA |
| REG1A | 5967 | ENSG00000115386 | regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| SDC1 | 6382 | ENSG00000115884 | syndecan 1 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-1 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-1 expression has been detected in several different tumor types. While several transcript variants may exist for this gene, the full-length natures of only two have been described to date. These two represent the major variants of this gene and encode the same protein. | NA |
| PIGHP1 | ENSG00000259657 | ENSG00000259657 | phosphatidylinositol glycan anchor biosynthesis class H pseudogene 1 | NA | NA |
| GNMT | 27232 | ENSG00000124713 | glycine N-methyltransferase | The protein encoded by this gene is an enzyme that catalyzes the conversion of S-adenosyl-L-methionine (along with glycine) to S-adenosyl-L-homocysteine and sarcosine. This protein is found in the cytoplasm and acts as a homotetramer. Defects in this gene are a cause of GNMT deficiency (hypermethioninemia). Alternative splicing results in multiple transcript variants. Naturally occurring readthrough transcription occurs between the upstream CNPY3 (canopy FGF signaling regulator 3) gene and this gene and is represented with GeneID:107080644. | NA |
| TACSTD2 | 4070 | ENSG00000184292 | tumor-associated calcium signal transducer 2 | This intronless gene encodes a carcinoma-associated antigen. This antigen is a cell surface receptor that transduces calcium signals. Mutations of this gene have been associated with gelatinous drop-like corneal dystrophy. | NA |
| RP11-534L20.4 | ENSG00000234981 | ENSG00000234981 | NA | NA | NA |
| RP11-421L21.2 | ENSG00000235795 | ENSG00000235795 | NA | NA | NA |
| RP11-173B14.4 | ENSG00000228444 | ENSG00000228444 | NA | NA | NA |
| HHEX | 3087 | ENSG00000152804 | hematopoietically expressed homeobox | This gene encodes a member of the homeobox family of transcription factors, many of which are involved in developmental processes. Expression in specific hematopoietic lineages suggests that this protein may play a role in hematopoietic differentiation. | NA |
| HIGD1B | 51751 | ENSG00000131097 | HIG1 hypoxia inducible domain family member 1B | This gene encodes a member of the hypoxia inducible gene 1 (HIG1) domain family. The encoded protein is localized to the cell membrane and has been linked to tumorigenesis and the progression of pituitary adenomas. Alternative splicing results in multiple transcript variants. | NA |
| GLB1L2 | 89944 | ENSG00000149328 | galactosidase beta 1 like 2 | NA | NA |
| SPINK1 | 6690 | ENSG00000164266 | serine peptidase inhibitor, Kazal type 1 | The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. | NA |
| NPM2 | 10361 | ENSG00000158806 | nucleophosmin/nucleoplasmin 2 | NA | NA |
| TFPI2 | 7980 | ENSG00000105825 | tissue factor pathway inhibitor 2 | This gene encodes a member of the Kunitz-type serine proteinase inhibitor family. The protein can inhibit a variety of serine proteases including factor VIIa/tissue factor, factor Xa, plasmin, trypsin, chymotrypsin and plasma kallikrein. This gene has been identified as a tumor suppressor gene in several types of cancer. Alternative splicing results in multiple transcript variants. | NA |
| TNFRSF12A | 51330 | ENSG00000006327 | tumor necrosis factor receptor superfamily member 12A | NA | NA |
| SLC39A11 | 201266 | ENSG00000133195 | solute carrier family 39 member 11 | NA | NA |
| EEF1A1P9 | ENSG00000249264 | ENSG00000249264 | eukaryotic translation elongation factor 1 alpha 1 pseudogene 9 | NA | NA |
| VAMP8 | 8673 | ENSG00000118640 | vesicle associated membrane protein 8 | This gene encodes an integral membrane protein that belongs to the synaptobrevin/vesicle-associated membrane protein subfamily of soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNAREs). The encoded protein is involved in the fusion of synaptic vesicles with the presynaptic membrane. | NA |
| TRAF3IP2 | 10758 | ENSG00000056972 | TRAF3 interacting protein 2 | This gene encodes a protein involved in regulating responses to cytokines by members of the Rel/NF-kappaB transcription factor family. These factors play a central role in innate immunity in response to pathogens, inflammatory signals and stress. This gene product interacts with TRAF proteins (tumor necrosis factor receptor-associated factors) and either I-kappaB kinase or MAP kinase to activate either NF-kappaB or Jun kinase. Several alternative transcripts encoding different isoforms have been identified. Another transcript, which does not encode a protein and is transcribed in the opposite orientation, has been identified. Overexpression of this transcript has been shown to reduce expression of at least one of the protein encoding transcripts, suggesting it has a regulatory role in the expression of this gene. | NA |
| ZNF215 | 7762 | ENSG00000149054 | zinc finger protein 215 | NA | NA |
| THBS4 | 7060 | ENSG00000113296 | thrombospondin 4 | The protein encoded by this gene belongs to the thrombospondin protein family. Thrombospondin family members are adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions. This protein forms a pentamer and can bind to heparin and calcium. It is involved in local signaling in the developing and adult nervous system, and it contributes to spinal sensitization and neuropathic pain states. This gene is activated during the stromal response to invasive breast cancer. It may also play a role in inflammatory responses in Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | NA |
| AEN | 64782 | ENSG00000181026 | apoptosis enhancing nuclease | NA | NA |
| RAB11FIP1 | 80223 | ENSG00000156675 | RAB11 family interacting protein 1 | This gene encodes one of the Rab11-family interacting proteins (Rab11-FIPs), which play a role in the Rab-11 mediated recycling of vesicles. The encoded protein may be involved in endocytic sorting, trafficking of proteins including integrin subunits and epidermal growth factor receptor (EGFR), and transport between the recycling endosome and the trans-Golgi network. Alternative splicing results in multiple transcript variants. A pseudogene is described on the X chromosome. | NA |
| SLC39A14 | 23516 | ENSG00000104635 | solute carrier family 39 member 14 | Zinc is an essential cofactor for hundreds of enzymes. It is involved in protein, nucleic acid, carbohydrate, and lipid metabolism, as well as in the control of gene transcription, growth, development, and differentiation. SLC39A14 belongs to a subfamily of proteins that show structural characteristics of zinc transporters (Taylor and Nicholson, 2003 [PubMed 12659941]). | NA |
| TCIRG1 | 10312 | ENSG00000110719 | T-cell immune regulator 1, ATPase H+ transporting V0 subunit a3 | Through alternate splicing, this gene encodes two proteins with similarity to subunits of the vacuolar ATPase (V-ATPase) but the encoded proteins seem to have different functions. V-ATPase is a multisubunit enzyme that mediates acidification of eukaryotic intracellular organelles. V-ATPase dependent organelle acidification is necessary for such intracellular processes as protein sorting, zymogen activation, and receptor-mediated endocytosis. V-ATPase is comprised of a cytosolic V1 domain and a transmembrane V0 domain. Mutations in this gene are associated with infantile malignant osteopetrosis. | NA |
| NA | NA | ENSG00000225410 | NA | NA | TRUE |
| G0S2 | 50486 | ENSG00000123689 | G0/G1 switch 2 | NA | NA |
| METTL1 | 4234 | ENSG00000037897 | methyltransferase like 1 | This gene is similar in sequence to the S. cerevisiae YDL201w gene. The gene product contains a conserved S-adenosylmethionine-binding motif and is inactivated by phosphorylation. Alternative splice variants encoding different protein isoforms have been described for this gene. A pseudogene has been identified on chromosome X. | NA |
| KIAA0922 | 23240 | ENSG00000121210 | KIAA0922 | NA | NA |
| ACTG2 | 72 | ENSG00000163017 | actin, gamma 2, smooth muscle, enteric | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | NA |
| RP11-713M15.2 | ENSG00000272502 | ENSG00000272502 | NA | NA | NA |
| TRIM5 | 85363 | ENSG00000132256 | tripartite motif containing 5 | The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region. The protein forms homo-oligomers via the coiled-coil region and localizes to cytoplasmic bodies. It appears to function as an E3 ubiquitin-ligase and ubiquitinates itself to regulate its subcellular localization. It may play a role in retroviral restriction. Multiple alternatively spliced transcript variants encoding different isoforms have been described for this gene. | NA |
| MATN1-AS1 | 100129196 | ENSG00000186056 | MATN1 antisense RNA 1 | NA | NA |
| ANPEP | 290 | ENSG00000166825 | alanyl aminopeptidase, membrane | Aminopeptidase N is located in the small-intestinal and renal microvillar membrane, and also in other plasma membranes. In the small intestine aminopeptidase N plays a role in the final digestion of peptides generated from hydrolysis of proteins by gastric and pancreatic proteases. Its function in proximal tubular epithelial cells and other cell types is less clear. The large extracellular carboxyterminal domain contains a pentapeptide consensus sequence characteristic of members of the zinc-binding metalloproteinase superfamily. Sequence comparisons with known enzymes of this class showed that CD13 and aminopeptidase N are identical. The latter enzyme was thought to be involved in the metabolism of regulatory peptides by diverse cell types, including small intestinal and renal tubular epithelial cells, macrophages, granulocytes, and synaptic membranes from the CNS. Human aminopeptidase N is a receptor for one strain of human coronavirus that is an important cause of upper respiratory tract infections. Defects in this gene appear to be a cause of various types of leukemia or lymphoma. | NA |
| NEAT1 | 283131 | ENSG00000245532 | nuclear paraspeckle assembly transcript 1 (non-protein coding) | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | NA |
| RP5-1148A21.3 | ENSG00000266680 | ENSG00000266680 | NA | NA | NA |
| FAM134B | 54463 | ENSG00000154153 | family with sequence similarity 134 member B | The protein encoded by this gene is a cis-Golgi transmembrane protein that may be necessary for the long-term survival of nociceptive and autonomic ganglion neurons. Mutations in this gene are a cause of hereditary sensory and autonomic neuropathy type IIB (HSAN IIB), and this gene may also play a role in susceptibility to vascular dementia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| C12orf45 | 121053 | ENSG00000151131 | chromosome 12 open reading frame 45 | NA | NA |
| SCARNA2 | 677766 | ENSG00000270066 | small Cajal body-specific RNA 2 | NA | NA |
| FARP1-AS1 | ENSG00000231194 | ENSG00000231194 | FARP1 antisense RNA 1 | NA | NA |
| TNFRSF19 | 55504 | ENSG00000127863 | tumor necrosis factor receptor superfamily member 19 | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor is highly expressed during embryonic development. It has been shown to interact with TRAF family members, and to activate JNK signaling pathway when overexpressed in cells. This receptor is capable of inducing apoptosis by a caspase-independent mechanism, and it is thought to play an essential role in embryonic development. Alternatively spliced transcript variants encoding distinct isoforms have been described. | NA |
| ERRFI1 | 54206 | ENSG00000116285 | ERBB receptor feedback inhibitor 1 | ERRFI1 is a cytoplasmic protein whose expression is upregulated with cell growth (Wick et al., 1995 [PubMed 7641805]). It shares significant homology with the protein product of rat gene-33, which is induced during cell stress and mediates cell signaling (Makkinje et al., 2000 [PubMed 10749885]; Fiorentino et al., 2000 [PubMed 11003669]). | NA |
| MPP6 | 51678 | ENSG00000105926 | membrane palmitoylated protein 6 | Members of the peripheral membrane-associated guanylate kinase (MAGUK) family function in tumor suppression and receptor clustering by forming multiprotein complexes containing distinct sets of transmembrane, cytoskeletal, and cytoplasmic signaling proteins. All MAGUKs contain a PDZ-SH3-GUK core and are divided into 4 subfamilies, DLG-like (see DLG1; MIM 601014), ZO1-like (see TJP1; MIM 601009), p55-like (see MPP1; MIM 305360), and LIN2-like (see CASK; MIM 300172), based on their size and the presence of additional domains. MPP6 is a member of the p55-like MAGUK subfamily (Tseng et al., 2001 [PubMed 11311936]). | NA |
| CDC20P1 | ENSG00000231007 | ENSG00000231007 | cell division cycle 20 pseudogene 1 | NA | NA |
| EMID1 | 129080 | ENSG00000186998 | EMI domain containing 1 | NA | NA |
| ADGRG1 | 9289 | ENSG00000205336 | adhesion G protein-coupled receptor G1 | This gene encodes a member of the G protein-coupled receptor family and regulates brain cortical patterning. The encoded protein binds specifically to transglutaminase 2, a component of tissue and tumor stroma implicated as an inhibitor of tumor progression. Mutations in this gene are associated with a brain malformation known as bilateral frontoparietal polymicrogyria. Alternative splicing results in multiple transcript variants. | NA |
| MTHFD2 | 10797 | ENSG00000065911 | methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2, methenyltetrahydrofolate cyclohydrolase | This gene encodes a nuclear-encoded mitochondrial bifunctional enzyme with methylenetetrahydrofolate dehydrogenase and methenyltetrahydrofolate cyclohydrolase activities. The enzyme functions as a homodimer and is unique in its absolute requirement for magnesium and inorganic phosphate. Formation of the enzyme-magnesium complex allows binding of NAD. Alternative splicing results in two different transcripts, one protein-coding and the other not protein-coding. This gene has a pseudogene on chromosome 7. | NA |
| FGF18 | 8817 | ENSG00000156427 | fibroblast growth factor 18 | The protein encoded by this gene is a member of the fibroblast growth factor (FGF) family. FGF family members possess broad mitogenic and cell survival activities, and are involved in a variety of biological processes, including embryonic development, cell growth, morphogenesis, tissue repair, tumor growth, and invasion. It has been shown in vitro that this protein is able to induce neurite outgrowth in PC12 cells. Studies of the similar proteins in mouse and chick suggested that this protein is a pleiotropic growth factor that stimulates proliferation in a number of tissues, most notably the liver and small intestine. Knockout studies of the similar gene in mice implied the role of this protein in regulating proliferation and differentiation of midline cerebellar structures. | NA |
| ZNF321P | 399669 | ENSG00000213801 | zinc finger protein 321, pseudogene | NA | NA |
| CASP4 | 837 | ENSG00000196954 | caspase 4 | This gene encodes a protein that is a member of the cysteine-aspartic acid protease (caspase) family. Sequential activation of caspases plays a central role in the execution-phase of cell apoptosis. Caspases exist as inactive proenzymes composed of a prodomain and a large and small protease subunit. Activation of caspases requires proteolytic processing at conserved internal aspartic residues to generate a heterodimeric enzyme consisting of the large and small subunits. This caspase is able to cleave and activate its own precursor protein, as well as caspase 1 precursor. When overexpressed, this gene induces cell apoptosis. Alternative splicing results in transcript variants encoding distinct isoforms. | NA |
| CTC-301O7.4 | ENSG00000197813 | ENSG00000197813 | NA | NA | NA |
| CAPG | 822 | ENSG00000042493 | capping actin protein, gelsolin like | This gene encodes a member of the gelsolin/villin family of actin-regulatory proteins. The encoded protein reversibly blocks the barbed ends of F-actin filaments in a Ca2+ and phosphoinositide-regulated manner, but does not sever preformed actin filaments. By capping the barbed ends of actin filaments, the encoded protein contributes to the control of actin-based motility in non-muscle cells. Alternatively spliced transcript variants have been observed for this gene. | NA |
| TOM1L1 | 10040 | ENSG00000141198 | target of myb1 like 1 membrane trafficking protein | NA | NA |
| TMBIM1 | 64114 | ENSG00000135926 | transmembrane BAX inhibitor motif containing 1 | NA | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 2, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
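The query, tabulate, and write steps are repeated for each cluster, so they could be wrapped in a small helper. The following is only a sketch: `annotate_cluster`, its arguments, and the `query_fun` parameter are hypothetical names introduced here for illustration and are not part of the original analysis. By default it calls `mygene::queryMany` (which needs network access); passing a stub function allows the file-writing logic to be checked offline.

```r
# Hypothetical helper (not from the original analysis): annotate one cluster
# of Ensembl gene IDs and write the queried IDs to a per-cluster text file.
annotate_cluster <- function(gene_ids, cluster, out_dir,
                             query_fun = mygene::queryMany) {
  # Same query arguments as used in the calls in this document.
  out <- query_fun(gene_ids, scopes = "ensembl.gene",
                   fields = c("name", "summary", "symbol"), species = "human")
  # One file per cluster, named gene_names_clus_<cluster>.txt
  out_file <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(as.character(out$query), out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  # Return a plain data frame, ready to pass to kable()
  as.data.frame(out)
}

# Possible usage, assuming gene_list and the output directory exist as above:
# tab3 <- annotate_cluster(gene_list[3, ], 3, "../utilities/GTEX2013_sparse_load_voom")
# kable(tab3)
```

Writing the IDs as character rather than as a factor produces the same file contents here, since `quote = FALSE` prints the labels either way.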
out <- mygene::queryMany(gene_list[3,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| name | X_id | summary | symbol | query |
|---|---|---|---|---|
| dual specificity phosphatase 4 | 1846 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, ERK2 and JNK, is expressed in a variety of tissues, and is localized in the nucleus. Two alternatively spliced transcript variants, encoding distinct isoforms, have been observed for this gene. In addition, multiple polyadenylation sites have been reported. | DUSP4 | ENSG00000120875 |
| UL16 binding protein 2 | 80328 | This gene encodes a major histocompatibility complex (MHC) class I-related molecule that binds to the NKG2D receptor on natural killer (NK) cells to trigger release of multiple cytokines and chemokines that in turn contribute to the recruitment and activation of NK cells. The encoded protein undergoes further processing to generate the mature protein that is either anchored to membrane via a glycosylphosphatidylinositol moiety, or secreted. Many malignant cells secrete the encoded protein to evade immunosurveillance by NK cells. This gene is located in a cluster of multiple MHC class I-related genes on chromosome 6. | ULBP2 | ENSG00000131015 |
| VLDLR antisense RNA 1 | 401491 | NA | VLDLR-AS1 | ENSG00000236404 |
| homer scaffolding protein 2 | 9455 | This gene encodes a member of the homer family of dendritic proteins. Members of this family regulate group 1 metabotropic glutamate receptor function. The encoded protein is a postsynaptic density scaffolding protein. Alternative splicing results in multiple transcript variants. Two related pseudogenes have been identified on chromosome 14. | HOMER2 | ENSG00000103942 |
| podoplanin | 10630 | This gene encodes a type-I integral membrane glycoprotein with diverse distribution in human tissues. The physiological function of this protein may be related to its mucin-type character. The homologous protein in other species has been described as a differentiation antigen and influenza-virus receptor. The specific function of this protein has not been determined but it has been proposed as a marker of lung injury. Alternatively spliced transcript variants encoding different isoforms have been identified. | PDPN | ENSG00000162493 |
| glutamic pyruvate transaminase (alanine aminotransferase) 2 | 84706 | This gene encodes a mitochondrial alanine transaminase, a pyridoxal enzyme that catalyzes the reversible transamination between alanine and 2-oxoglutarate to generate pyruvate and glutamate. Alanine transaminases play roles in gluconeogenesis and amino acid metabolism in many tissues including skeletal muscle, kidney, and liver. Activating transcription factor 4 upregulates this gene under metabolic stress conditions in hepatocyte cell lines. A loss of function mutation in this gene has been associated with developmental encephalopathy. Alternative splicing results in multiple transcript variants. | GPT2 | ENSG00000166123 |
| dual specificity phosphatase 5 | 1847 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, is expressed in a variety of tissues with the highest levels in pancreas and brain, and is localized in the nucleus. | DUSP5 | ENSG00000138166 |
| ring finger protein 144A | 9781 | The protein encoded by this gene contains a RING finger, a motif known to be involved in protein-DNA and protein-protein interactions. The mouse counterpart of this protein has been shown to interact with Ube2l3/UbcM4, which is an ubiquitin-conjugating enzyme involved in embryonic development. | RNF144A | ENSG00000151692 |
| integrin subunit alpha 8 | 8516 | Integrins are heterodimeric transmembrane receptor proteins that mediate numerous cellular processes including cell adhesion, cytoskeletal rearrangement, and activation of cell signaling pathways. Integrins are composed of alpha and beta subunits. This gene encodes the alpha 8 subunit of the heterodimeric integrin alpha8beta1 protein. The encoded protein is a single-pass type 1 membrane protein that contains multiple FG-GAP repeats. This repeat is predicted to fold into a beta propeller structure. This gene regulates the recruitment of mesenchymal cells into epithelial structures, mediates cell-cell interactions, and regulates neurite outgrowth of sensory and motor neurons. The integrin alpha8beta1 protein thus plays an important role in wound-healing and organogenesis. Mutations in this gene have been associated with renal hypodysplasia/aplasia-1 (RHDA1) and with several animal models of chronic kidney disease. Alternate splicing results in multiple transcript variants encoding distinct isoforms. | ITGA8 | ENSG00000077943 |
| very low density lipoprotein receptor | 7436 | The low density lipoprotein receptor (LDLR) gene family consists of cell surface proteins involved in receptor-mediated endocytosis of specific ligands. This gene encodes a lipoprotein receptor that is a member of the LDLR family and plays important roles in VLDL-triglyceride metabolism and the reelin signaling pathway. Mutations in this gene cause VLDLR-associated cerebellar hypoplasia. Alternative splicing generates multiple transcript variants encoding distinct isoforms for this gene. | VLDLR | ENSG00000147852 |
| cholinergic receptor nicotinic epsilon subunit | 1145 | Acetylcholine receptors at mature mammalian neuromuscular junctions are pentameric protein complexes composed of four subunits in the ratio of two alpha subunits to one beta, one epsilon, and one delta subunit. The acetylcholine receptor changes subunit composition shortly after birth when the epsilon subunit replaces the gamma subunit seen in embryonic receptors. Mutations in the epsilon subunit are associated with congenital myasthenic syndrome. | CHRNE | ENSG00000108556 |
| glycoprotein 2 | 2813 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | GP2 | ENSG00000169347 |
| brain enriched guanylate kinase associated | 57596 | NA | BEGAIN | ENSG00000183092 |
| coiled-coil domain containing 150 pseudogene 1 | ENSG00000256304 | NA | CCDC150P1 | ENSG00000256304 |
| colony stimulating factor 3 | 1440 | The protein encoded by this gene is a cytokine that controls the production, differentiation, and function of granulocytes. The active protein is found extracellularly. Alternatively spliced transcript variants have been described for this gene. | CSF3 | ENSG00000108342 |
| transmembrane protein 266 | 123591 | NA | TMEM266 | ENSG00000169758 |
| NA | ENSG00000255201 | NA | RP11-350N15.4 | ENSG00000255201 |
| BCL2 related protein A1 | 597 | This gene encodes a member of the BCL-2 protein family. The proteins of this family form hetero- or homodimers and act as anti- and pro-apoptotic regulators that are involved in a wide variety of cellular activities such as embryonic development, homeostasis and tumorigenesis. The protein encoded by this gene is able to reduce the release of pro-apoptotic cytochrome c from mitochondria and block caspase activation. This gene is a direct transcription target of NF-kappa B in response to inflammatory mediators, and is up-regulated by different extracellular signals, such as granulocyte-macrophage colony-stimulating factor (GM-CSF), CD40, phorbol ester and inflammatory cytokine TNF and IL-1, which suggests a cytoprotective function that is essential for lymphocyte activation as well as cell survival. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | BCL2A1 | ENSG00000140379 |
| NA | ENSG00000253785 | NA | CTC-308K20.3 | ENSG00000253785 |
| protease, serine 1 | 5644 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | PRSS1 | ENSG00000204983 |
| oculomedin | 10896 | The protein encoded by this gene is induced by cyclic mechanical stretching in trabecular cells of the eye and it is also expressed in retina. This protein may play a role in trabecular meshwork function and the development of glaucoma. | OCLM | ENSG00000262180 |
| serine peptidase inhibitor, Kazal type 1 | 6690 | The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. | SPINK1 | ENSG00000164266 |
| alkaline ceramidase 2 | 340485 | The sphingolipid metabolite sphingosine-1-phosphate promotes cell proliferation and survival, whereas its precursor, sphingosine, has the opposite effect. The ceramidase ACER2 hydrolyzes very long chain ceramides to generate sphingosine (Xu et al., 2006 [PubMed 16940153]). | ACER2 | ENSG00000177076 |
| NA | ENSG00000261575 | NA | RP11-259G18.1 | ENSG00000261575 |
| NA | ENSG00000236364 | NA | RP11-525G13.2 | ENSG00000236364 |
| NA | ENSG00000259326 | NA | RP11-102L12.2 | ENSG00000259326 |
| early growth response 3 | 1960 | This gene encodes a transcriptional regulator that belongs to the EGR family of C2H2-type zinc-finger proteins. It is an immediate-early growth response gene which is induced by mitogenic stimulation. The protein encoded by this gene participates in the transcriptional regulation of genes in controlling biological rhythm. It may also play a role in a wide variety of processes including muscle development, lymphocyte development, endothelial cell growth and migration, and neuronal development. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | EGR3 | ENSG00000179388 |
| NA | ENSG00000258895 | NA | CTD-2643K12.1 | ENSG00000258895 |
| phospholamban | 5350 | The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | PLN | ENSG00000198523 |
| ankyrin repeat and SOCS box containing 2 | 51676 | This gene encodes a member of the ankyrin repeat and SOCS box-containing (ASB) protein family. These proteins play a role in protein degradation by coupling suppressor of cytokine signalling (SOCS) proteins with the elongin BC complex. The encoded protein is a subunit of a multimeric E3 ubiquitin ligase complex that mediates the degradation of actin-binding proteins. This gene plays a role in retinoic acid-induced growth inhibition and differentiation of myeloid leukemia cells. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ASB2 | ENSG00000100628 |
| uncharacterized LOC101928399 | 101928399 | NA | TCONS_00029157 | ENSG00000237989 |
| polymerase (DNA) beta | 5423 | The protein encoded by this gene is a DNA polymerase involved in base excision and repair, also called gap-filling DNA synthesis. The encoded protein, acting as a monomer, is normally found in the cytoplasm, but it translocates to the nucleus upon DNA damage. Several transcript variants of this gene exist, but the full-length nature of only one has been described to date. | POLB | ENSG00000070501 |
| small nucleolar RNA, H/ACA box 64 | 26784 | NA | SNORA64 | ENSG00000207405 |
| angiomotin like 1 | 154810 | The protein encoded by this gene is a peripheral membrane protein that is a component of tight junctions or TJs. TJs form an apical junctional structure and act to control paracellular permeability and maintain cell polarity. This protein is related to angiomotin, an angiostatin binding protein that regulates endothelial cell migration and capillary formation. Two transcript variants encoding different isoforms have been found for this gene. | AMOTL1 | ENSG00000166025 |
| zinc finger protein 385C | 201181 | NA | ZNF385C | ENSG00000187595 |
| NA | ENSG00000255513 | NA | AC005363.9 | ENSG00000255513 |
| chromodomain helicase DNA binding protein 7 | 55636 | This gene encodes a protein that contains several helicase family domains. Mutations in this gene have been found in some patients with the CHARGE syndrome. Two transcript variants encoding different isoforms have been found for this gene. | CHD7 | ENSG00000171316 |
| NA | ENSG00000229299 | NA | RP4-583P15.10 | ENSG00000229299 |
| BCL2/adenovirus E1B 19kDa interacting protein 3 | 664 | This gene encodes a mitochondrial protein that contains a BH3 domain and acts as a pro-apoptotic factor. The encoded protein interacts with anti-apoptotic proteins, including the E1B 19 kDa protein and Bcl2. This gene is silenced in tumors by DNA methylation. | BNIP3 | ENSG00000176171 |
| phospholipase A2 group IB | 5319 | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | PLA2G1B | ENSG00000170890 |
| RNA, 7SK small nuclear pseudogene 70 | ENSG00000252464 | NA | RN7SKP70 | ENSG00000252464 |
| hypoxia inducible lipid droplet associated | 29923 | NA | HILPDA | ENSG00000135245 |
| NA | ENSG00000259407 | NA | RP11-158M2.3 | ENSG00000259407 |
| apolipoprotein L4 | 80832 | The protein encoded by this gene is a member of the apolipoprotein L family and may play a role in lipid exchange and transport throughout the body, as well as in reverse cholesterol transport from peripheral cells to the liver. Two transcript variants encoding two different isoforms have been found for this gene. Only one of the isoforms appears to be a secreted protein. | APOL4 | ENSG00000100336 |
| mesoderm specific transcript | 4232 | This gene encodes a member of the alpha/beta hydrolase superfamily. It is imprinted, exhibiting preferential expression from the paternal allele in fetal tissues, and isoform-specific imprinting in lymphocytes. The loss of imprinting of this gene has been linked to certain types of cancer and may be due to promoter switching. The encoded protein may play a role in development. Alternatively spliced transcript variants encoding multiple isoforms have been identified for this gene. Pseudogenes of this gene are located on the short arm of chromosomes 3 and 4, and the long arm of chromosomes 6 and 15. | MEST | ENSG00000106484 |
| NA | ENSG00000270890 | NA | RP3-468K18.6 | ENSG00000270890 |
| NA | ENSG00000256469 | NA | RP11-856F16.2 | ENSG00000256469 |
| ATPase Na+/K+ transporting subunit alpha 2 | 477 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 2 subunit. Mutations in this gene result in familial basilar or hemiplegic migraines, and in a rare syndrome known as alternating hemiplegia of childhood. | ATP1A2 | ENSG00000018625 |
| glutamate-ammonia ligase | 2752 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | GLUL | ENSG00000135821 |
| enhancer of zeste 2 polycomb repressive complex 2 subunit | 2146 | This gene encodes a member of the Polycomb-group (PcG) family. PcG family members form multimeric protein complexes, which are involved in maintaining the transcriptional repressive state of genes over successive cell generations. This protein associates with the embryonic ectoderm development protein, the VAV1 oncoprotein, and the X-linked nuclear protein. This protein may play a role in the hematopoietic and central nervous systems. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified for this gene. | EZH2 | ENSG00000106462 |
| transthyretin | 7276 | This gene encodes transthyretin, one of the three prealbumins including alpha-1-antitrypsin, transthyretin and orosomucoid. Transthyretin is a carrier protein; it transports thyroid hormones in the plasma and cerebrospinal fluid, and also transports retinol (vitamin A) in the plasma. The protein consists of a tetramer of identical subunits. More than 80 different mutations in this gene have been reported; most mutations are related to amyloid deposition, affecting predominantly peripheral nerve and/or the heart, and a small portion of the gene mutations is non-amyloidogenic. The diseases caused by mutations include amyloidotic polyneuropathy, euthyroid hyperthyroxinaemia, amyloidotic vitreous opacities, cardiomyopathy, oculoleptomeningeal amyloidosis, meningocerebrovascular amyloidosis, carpal tunnel syndrome, etc. | TTR | ENSG00000118271 |
| regulator of G-protein signaling 16 | 6004 | The protein encoded by this gene belongs to the ‘regulator of G protein signaling’ family. It inhibits signal transduction by increasing the GTPase activity of G protein alpha subunits. It also may play a role in regulating the kinetics of signaling in the phototransduction cascade. | RGS16 | ENSG00000143333 |
| small nucleolar RNA host gene 15 | ENSG00000232956 | NA | SNHG15 | ENSG00000232956 |
| laminin subunit alpha 3 | 3909 | The protein encoded by this gene belongs to the laminin family of secreted molecules. Laminins are heterotrimeric molecules that consist of alpha, beta, and gamma subunits that assemble through a coiled-coil domain. Laminins are essential for formation and function of the basement membrane and have additional functions in regulating cell migration and mechanical signal transduction. This gene encodes an alpha subunit and is responsive to several epithelial-mesenchymal regulators including keratinocyte growth factor, epidermal growth factor and insulin-like growth factor. Mutations in this gene have been identified as the cause of Herlitz type junctional epidermolysis bullosa and laryngoonychocutaneous syndrome. Alternative splicing and alternative promoter usage result in multiple transcript variants. | LAMA3 | ENSG00000053747 |
| zinc finger protein 878 | 729747 | NA | ZNF878 | ENSG00000257446 |
| general transcription factor IIi pseudogene 14 | ENSG00000226002 | NA | GTF2IP14 | ENSG00000226002 |
| NA | ENSG00000219470 | NA | RP3-337H4.6 | ENSG00000219470 |
| NA | ENSG00000258168 | NA | RP11-588H23.3 | ENSG00000258168 |
| cortexin 1 | 404217 | NA | CTXN1 | ENSG00000178531 |
| taste 2 receptor member 19 | 259294 | NA | TAS2R19 | ENSG00000212124 |
| transmembrane protein 120B | 144404 | NA | TMEM120B | ENSG00000188735 |
| guanine nucleotide binding protein-like 3 (nucleolar)-like pseudogene 1 | ENSG00000215032 | NA | GNL3LP1 | ENSG00000215032 |
| uncharacterized LOC101928371 | 101928371 | NA | LOC101928371 | ENSG00000225420 |
| family with sequence similarity 153 member C | ENSG00000204677 | NA | FAM153C | ENSG00000204677 |
| carnitine palmitoyltransferase 1B | 1375 | The protein encoded by this gene, a member of the carnitine/choline acetyltransferase family, is the rate-controlling enzyme of the long-chain fatty acid beta-oxidation pathway in muscle mitochondria. This enzyme is required for the net transport of long-chain fatty acyl-CoAs from the cytoplasm into the mitochondria. Multiple transcript variants encoding different isoforms have been found for this gene, and read-through transcripts are expressed from the upstream locus that include exons from this gene. | CPT1B | ENSG00000205560 |
| transmembrane protein 97 | 27346 | TMEM97 is a conserved integral membrane protein that plays a role in controlling cellular cholesterol levels (Bartz et al., 2009 [PubMed 19583955]). | TMEM97 | ENSG00000109084 |
| actin binding LIM protein 1 | 3983 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | ABLIM1 | ENSG00000099204 |
| zinc finger protein 841 | 284371 | NA | ZNF841 | ENSG00000197608 |
| pyrophosphatase (inorganic) 1 | 5464 | The protein encoded by this gene is a member of the inorganic pyrophosphatase (PPase) family. PPases catalyze the hydrolysis of pyrophosphate to inorganic phosphate, which is important for the phosphate metabolism of cells. Studies of a similar protein in bovine suggested a cytoplasmic localization of this enzyme. | PPA1 | ENSG00000180817 |
| nucleoredoxin | 64359 | This gene encodes a member of the thioredoxin superfamily, a group of small, multifunctional redox-active proteins. Members of this family are characterized by a conserved active motif called the thioredoxin fold that catalyzes disulfide bond formation and isomerization. The encoded protein acts as a redox-dependent regulator of the Wnt signaling pathway and is involved in cell growth and differentiation. | NXN | ENSG00000167693 |
| metallothionein 2A | 4502 | NA | MT2A | ENSG00000125148 |
| tankyrase 1 binding protein 1 | 85456 | NA | TNKS1BP1 | ENSG00000149115 |
| TYRO3 protein tyrosine kinase | 7301 | The gene is part of a 3-member transmembrane receptor kinase receptor family with a processed pseudogene distal on chromosome 15. The encoded protein is activated by the products of the growth arrest-specific gene 6 and protein S genes and is involved in controlling cell survival and proliferation, spermatogenesis, immunoregulation and phagocytosis. The encoded protein has also been identified as a cell entry factor for Ebola and Marburg viruses. | TYRO3 | ENSG00000092445 |
| DEAD/H-box helicase 11 | 1663 | DEAD box proteins, characterized by the conserved motif Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein, which is an enzyme that possesses both ATPase and DNA helicase activities. This gene is a homolog of the yeast CHL1 gene, and may function to maintain chromosome transmission fidelity and genome stability. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | DDX11 | ENSG00000013573 |
| chymotrypsinogen B1 | 1504 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | CTRB1 | ENSG00000168925 |
| coiled-coil domain containing 144B (pseudogene) | 284047 | NA | CCDC144B | ENSG00000154874 |
| ATP binding cassette subfamily C member 4 | 10257 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MRP subfamily which is involved in multi-drug resistance. This family member plays a role in cellular detoxification as a pump for its substrate, organic anions. It may also function in prostaglandin-mediated cAMP signaling in ciliogenesis. Alternative splicing of this gene results in multiple transcript variants. | ABCC4 | ENSG00000125257 |
| regenerating family member 1 alpha | 5967 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1A | ENSG00000115386 |
| NA | ENSG00000246250 | NA | RP11-613D13.5 | ENSG00000246250 |
| NA | ENSG00000231628 | NA | RP3-355L5.4 | ENSG00000231628 |
| zinc finger and BTB domain containing 46 | 140685 | NA | ZBTB46 | ENSG00000130584 |
| islet cell autoantigen 1 | 3382 | This gene encodes a protein with an arfaptin homology domain that is found both in the cytosol and as membrane-bound form on the Golgi complex and immature secretory granules. This protein is believed to be an autoantigen in insulin-dependent diabetes mellitus and primary Sjogren’s syndrome. Several transcript variants encoding two different isoforms have been found for this gene. | ICA1 | ENSG00000003147 |
| ZDHHC20 intronic transcript 1 | ENSG00000236953 | NA | ZDHHC20-IT1 | ENSG00000236953 |
| thyroid hormone receptor interactor 10 | 9322 | NA | TRIP10 | ENSG00000125733 |
| NA | ENSG00000111788 | NA | RP11-22B23.1 | ENSG00000111788 |
| pentraxin 3 | 5806 | NA | PTX3 | ENSG00000163661 |
| pleckstrin homology like domain family A member 3 | 23612 | NA | PHLDA3 | ENSG00000174307 |
| nucleophosmin 1 (nucleolar phosphoprotein B23, numatrin) pseudogene 37 | ENSG00000219085 | NA | NPM1P37 | ENSG00000219085 |
| heterogeneous nuclear ribonucleoprotein A3 pseudogene 9 | ENSG00000270903 | NA | HNRNPA3P9 | ENSG00000270903 |
| GTF2I repeat domain containing 1 | 9569 | The protein encoded by this gene contains five GTF2I-like repeats and each repeat possesses a potential helix-loop-helix (HLH) motif. It may have the ability to interact with other HLH-proteins and function as a transcription factor or as a positive transcriptional regulator under the control of Retinoblastoma protein. This gene plays a role in craniofacial and cognitive development and mutations have been associated with Williams-Beuren syndrome, a multisystem developmental disorder caused by deletion of multiple genes at 7q11.23. Alternative splicing results in multiple transcript variants. | GTF2IRD1 | ENSG00000006704 |
| family with sequence similarity 43 member A | 131583 | NA | FAM43A | ENSG00000185112 |
| tropomyosin 3 pseudogene 6 | ENSG00000250731 | NA | TPM3P6 | ENSG00000250731 |
| Rho related BTB domain containing 1 | 9886 | The protein encoded by this gene belongs to the Rho family of the small GTPase superfamily. It contains a GTPase domain, a proline-rich region, a tandem of 2 BTB (broad complex, tramtrack, and bric-a-brac) domains, and a conserved C-terminal region. The protein plays a role in small GTPase-mediated signal transduction and the organization of the actin filament system. Alternate splicing results in multiple transcript variants. | RHOBTB1 | ENSG00000072422 |
| NA | ENSG00000270075 | NA | RP11-127L20.5 | ENSG00000270075 |
| potassium voltage-gated channel subfamily J member 12 | 3768 | This gene encodes an inwardly rectifying K+ channel which may be blocked by divalent cations. This protein is thought to be one of multiple inwardly rectifying channels which contribute to the cardiac inward rectifier current (IK1). The gene is located within the Smith-Magenis syndrome region on chromosome 17. | KCNJ12 | ENSG00000184185 |
| erythrocyte membrane protein band 4.1 | 2035 | The protein encoded by this gene, together with spectrin and actin, constitute the red cell membrane cytoskeletal network. This complex plays a critical role in erythrocyte shape and deformability. Mutations in this gene are associated with type 1 elliptocytosis (EL1). Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | EPB41 | ENSG00000159023 |
| ATP binding cassette subfamily G member 1 | 9619 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the White subfamily. It is involved in macrophage cholesterol and phospholipids transport, and may regulate cellular lipid homeostasis in other cell types. Six alternative splice variants have been identified. | ABCG1 | ENSG00000160179 |
| uncoupling protein 3 | 7352 | Mitochondrial uncoupling proteins (UCP) are members of the larger family of mitochondrial anion carrier proteins (MACP). UCPs separate oxidative phosphorylation from ATP synthesis with energy dissipated as heat, also referred to as the mitochondrial proton leak. UCPs facilitate the transfer of anions from the inner to the outer mitochondrial membrane and the return transfer of protons from the outer to the inner mitochondrial membrane. They also reduce the mitochondrial membrane potential in mammalian cells. The different UCPs have tissue-specific expression; this gene is primarily expressed in skeletal muscle. This gene’s protein product is postulated to protect mitochondria against lipid-induced oxidative stress. Expression levels of this gene increase when fatty acid supplies to mitochondria exceed their oxidation capacity and the protein enables the export of fatty acids from mitochondria. UCPs contain the three solcar protein domains typically found in MACPs. Two splice variants have been found for this gene. | UCP3 | ENSG00000175564 |
| pancreatic lipase | 5406 | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | PNLIP | ENSG00000175535 |
| NA | ENSG00000243829 | NA | CTB-33G10.1 | ENSG00000243829 |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 3, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
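Some query results carry a `notfound` column marking Ensembl IDs that mygene could not match, and writing `out$query` directly would include those IDs in the exported list. The following is a minimal sketch of one way to skip them, not part of the original analysis: the helper name `write_cluster_genes`, its arguments, and the toy data frame are hypothetical, and it assumes the query result can be coerced with `as.data.frame()`.

```r
# Hypothetical helper (an assumption, not the original workflow): write the
# Ensembl IDs of one cluster to a text file, skipping IDs flagged as not found.
write_cluster_genes <- function(out, cluster, out_dir) {
  res <- as.data.frame(out)
  # The `notfound` column is absent from some results, so check for it first.
  if ("notfound" %in% names(res)) {
    res <- res[is.na(res$notfound) | !res$notfound, , drop = FALSE]
  }
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(res$query, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Toy stand-in for a query result, built from three IDs that appear in this report.
toy <- data.frame(
  query = c("ENSG00000136244", "ENSG00000179294", "ENSG00000108342"),
  symbol = c("IL6", NA, "CSF3"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
path <- write_cluster_genes(toy, 4, tempdir())
readLines(path)
```

Whether unmatched IDs should be dropped or kept depends on how the exported list is used downstream; the sketch only illustrates the filtering step.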
out <- mygene::queryMany(gene_list[4,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | summary | name | X_id | notfound |
|---|---|---|---|---|---|
| IL6 | ENSG00000136244 | This gene encodes a cytokine that functions in inflammation and the maturation of B cells. In addition, the encoded protein has been shown to be an endogenous pyrogen capable of inducing fever in people with autoimmune diseases or infections. The protein is primarily produced at sites of acute and chronic inflammation, where it is secreted into the serum and induces a transcriptional inflammatory response through interleukin 6 receptor, alpha. The functioning of this gene is implicated in a wide variety of inflammation-associated disease states, including susceptibility to diabetes mellitus and systemic juvenile rheumatoid arthritis. Alternative splicing results in multiple transcript variants. | interleukin 6 | 3569 | NA |
| CSF3 | ENSG00000108342 | The protein encoded by this gene is a cytokine that controls the production, differentiation, and function of granulocytes. The active protein is found extracellularly. Alternatively spliced transcript variants have been described for this gene. | colony stimulating factor 3 | 1440 | NA |
| CXCL8 | ENSG00000169429 | The protein encoded by this gene is a member of the CXC chemokine family. This chemokine is one of the major mediators of the inflammatory response. This chemokine is secreted by several cell types. It functions as a chemoattractant, and is also a potent angiogenic factor. This gene is believed to play a role in the pathogenesis of bronchiolitis, a common respiratory tract disease caused by viral infection. This gene and ten other members of the CXC chemokine gene family form a chemokine gene cluster in a region mapped to chromosome 4q. | C-X-C motif chemokine ligand 8 | 3576 | NA |
| SPRR2E | ENSG00000203785 | This gene encodes a member of a family of small proline-rich proteins clustered in the epidermal differentiation complex on chromosome 1q21. The encoded protein, along with other family members, is a component of the cornified cell envelope that forms beneath the plasma membrane in terminally differentiated stratified squamous epithelia. This envelope serves as a barrier against extracellular and environmental factors. The seven SPRR2 genes (A-G) appear to have been homogenized by gene conversion compared to others in the cluster that exhibit greater differences in protein structure. | small proline rich protein 2E | 6704 | NA |
| PPP1R1A | ENSG00000135447 | NA | protein phosphatase 1 regulatory inhibitor subunit 1A | 5502 | NA |
| SLURP1 | ENSG00000126233 | The protein encoded by this gene is a member of the Ly6/uPAR family but lacks a GPI-anchoring signal sequence. It is thought that this secreted protein contains antitumor activity. Mutations in this gene have been associated with Mal de Meleda, a rare autosomal recessive skin disorder. This gene maps to the same chromosomal region as several members of the Ly6/uPAR family of glycoprotein receptors. | secreted LY6/PLAUR domain containing 1 | 57152 | NA |
| ATP1B1 | ENSG00000143153 | The protein encoded by this gene belongs to the family of Na+/K+ and H+/K+ ATPases beta chain proteins, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The beta subunit regulates, through assembly of alpha/beta heterodimers, the number of sodium pumps transported to the plasma membrane. The glycoprotein subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes a beta 1 subunit. Alternatively spliced transcript variants encoding different isoforms have been described, but their biological validity is not known. | ATPase Na+/K+ transporting subunit beta 1 | 481 | NA |
| SLC1A1 | ENSG00000106688 | This gene encodes a member of the high-affinity glutamate transporters that play an essential role in transporting glutamate across plasma membranes. In brain, these transporters are crucial in terminating the postsynaptic action of the neurotransmitter glutamate, and in maintaining extracellular glutamate concentrations below neurotoxic levels. This transporter also transports aspartate, and mutations in this gene are thought to cause dicarboxylic aminoaciduria, also known as glutamate-aspartate transport defect. | solute carrier family 1 member 1 | 6505 | NA |
| NA | ENSG00000179294 | NA | NA | NA | TRUE |
| GPRC5B | ENSG00000167191 | This gene encodes a member of the type 3 G protein-coupled receptor family. Members of this superfamily are characterized by a signature 7-transmembrane domain motif. The encoded protein may modulate insulin secretion and increased protein expression is associated with type 2 diabetes. Alternative splicing results in multiple transcript variants. | G protein-coupled receptor class C group 5 member B | 51704 | NA |
| NTRK1 | ENSG00000198400 | This gene encodes a member of the neurotrophic tyrosine kinase receptor (NTKR) family. This kinase is a membrane-bound receptor that, upon neurotrophin binding, phosphorylates itself and members of the MAPK pathway. The presence of this kinase leads to cell differentiation and may play a role in specifying sensory neuron subtypes. Mutations in this gene have been associated with congenital insensitivity to pain, anhidrosis, self-mutilating behavior, mental retardation and cancer. Alternate transcriptional splice variants of this gene have been found, but only three have been characterized to date. | neurotrophic receptor tyrosine kinase 1 | 4914 | NA |
| SOCS3 | ENSG00000184557 | This gene encodes a member of the STAT-induced STAT inhibitor (SSI), also known as suppressor of cytokine signaling (SOCS), family. SSI family members are cytokine-inducible negative regulators of cytokine signaling. The expression of this gene is induced by various cytokines, including IL6, IL10, and interferon (IFN)-gamma. The protein encoded by this gene can bind to JAK2 kinase, and inhibit the activity of JAK2 kinase. Studies of the mouse counterpart of this gene suggested the roles of this gene in the negative regulation of fetal liver hematopoiesis, and placental development. | suppressor of cytokine signaling 3 | 9021 | NA |
| RP11-845C23.3 | ENSG00000267396 | NA | NA | ENSG00000267396 | NA |
| SPX | ENSG00000134548 | The protein encoded by this gene is a hormone involved in modulation of cardiovascular and renal function. It has also been shown in rats to cause weight loss. Several transcript variants have been found for this gene. | spexin hormone | 80763 | NA |
| SPAG4 | ENSG00000061656 | The mammalian sperm flagellum contains two cytoskeletal structures associated with the axoneme: the outer dense fibers surrounding the axoneme in the midpiece and principal piece and the fibrous sheath surrounding the outer dense fibers in the principal piece of the tail. Defects in these structures are associated with abnormal tail morphology, reduced sperm motility, and infertility. In the rat, the protein encoded by this gene associates with an outer dense fiber protein via a leucine zipper motif and localizes to the microtubules of the manchette and axoneme during sperm tail development. Alternative splicing results in multiple transcript variants encoding different isoforms. | sperm associated antigen 4 | 6676 | NA |
| KRT14 | ENSG00000186847 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | keratin 14 | 3861 | NA |
| RP11-334E6.12 | ENSG00000263873 | NA | NA | ENSG00000263873 | NA |
| RBP1 | ENSG00000114115 | This gene encodes the carrier protein involved in the transport of retinol (vitamin A alcohol) from the liver storage site to peripheral tissue. Vitamin A is a fat-soluble vitamin necessary for growth, reproduction, differentiation of epithelial tissues, and vision. Multiple transcript variants encoding different isoforms have been found for this gene. | retinol binding protein 1 | 5947 | NA |
| SRSF12 | ENSG00000154548 | NA | serine and arginine rich splicing factor 12 | 135295 | NA |
| SPRR1A | ENSG00000169474 | NA | small proline rich protein 1A | 6698 | NA |
| THY1 | ENSG00000154096 | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | Thy-1 cell surface antigen | 7070 | NA |
| LIF | ENSG00000128342 | The protein encoded by this gene is a pleiotropic cytokine with roles in several different systems. It is involved in the induction of hematopoietic differentiation in normal and myeloid leukemia cells, induction of neuronal cell differentiation, regulator of mesenchymal to epithelial conversion during kidney development, and may also have a role in immune tolerance at the maternal-fetal interface. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | leukemia inhibitory factor | 3976 | NA |
| RAMP1 | ENSG00000132329 | The protein encoded by this gene is a member of the RAMP family of single-transmembrane-domain proteins, called receptor (calcitonin) activity modifying proteins (RAMPs). RAMPs are type I transmembrane proteins with an extracellular N terminus and a cytoplasmic C terminus. RAMPs are required to transport calcitonin-receptor-like receptor (CRLR) to the plasma membrane. CRLR, a receptor with seven transmembrane domains, can function as either a calcitonin-gene-related peptide (CGRP) receptor or an adrenomedullin receptor, depending on which members of the RAMP family are expressed. In the presence of this (RAMP1) protein, CRLR functions as a CGRP receptor. The RAMP1 protein is involved in the terminal glycosylation, maturation, and presentation of the CGRP receptor to the cell surface. Alternative splicing results in multiple transcript variants encoding different isoforms. | receptor activity modifying protein 1 | 10267 | NA |
| LRRC2 | ENSG00000163827 | This gene encodes a member of the leucine-rich repeat-containing family of proteins, which function in diverse biological pathways. This family member may possibly be a nuclear protein. Similarity to the RAS suppressor protein, as well as expression down-regulation observed in tumor cells, suggests that it may function as a tumor suppressor. The gene is located in the chromosome 3 common eliminated region 1 (C3CER1), a 1.4 Mb region that is commonly deleted in diverse tumors. A related pseudogene has been identified on chromosome 2. | leucine rich repeat containing 2 | 79442 | NA |
| LGALS7B | ENSG00000178934 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. Differential and in situ hybridization studies indicate that this lectin is specifically expressed in keratinocytes and found mainly in stratified squamous epithelium. A duplicate copy of this gene (GeneID:3963) is found adjacent to, but on the opposite strand on chromosome 19. | galectin 7B | 653499 | NA |
| EPB41L4B | ENSG00000095203 | NA | erythrocyte membrane protein band 4.1 like 4B | 54566 | NA |
| MESP1 | ENSG00000166823 | NA | mesoderm posterior bHLH transcription factor 1 | 55897 | NA |
| BAALC | ENSG00000164929 | This gene was identified by gene expression studies in patients with acute myeloid leukemia (AML). The gene is conserved among mammals and is not found in lower organisms. Tissues that express this gene develop from the neuroectoderm. Multiple alternatively spliced transcript variants that encode different proteins have been described for this gene; however, some of the transcript variants are found only in AML cell lines. | brain and acute leukemia, cytoplasmic | 79870 | NA |
| SPRR1B | ENSG00000169469 | The protein encoded by this gene is an envelope protein of keratinocytes. The encoded protein is crosslinked to membrane proteins by transglutaminase, forming an insoluble layer under the plasma membrane. This protein is proline-rich and contains several tandem amino acid repeats. | small proline rich protein 1B | 6699 | NA |
| RP3-414A15.10 | ENSG00000258603 | NA | NA | ENSG00000258603 | NA |
| KIAA1161 | ENSG00000164976 | NA | KIAA1161 | 57462 | NA |
| IL32 | ENSG00000008517 | This gene encodes a member of the cytokine family. The protein contains a tyrosine sulfation site, 3 potential N-myristoylation sites, multiple putative phosphorylation sites, and an RGD cell-attachment sequence. Expression of this protein is increased after the activation of T-cells by mitogens or the activation of NK cells by IL-2. This protein induces the production of TNFalpha from macrophage cells. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | interleukin 32 | 9235 | NA |
| NKX3-1 | ENSG00000167034 | This gene encodes a homeobox-containing transcription factor. This transcription factor functions as a negative regulator of epithelial cell growth in prostate tissue. Aberrant expression of this gene is associated with prostate tumor progression. Alternate splicing results in multiple transcript variants of this gene. | NK3 homeobox 1 | 4824 | NA |
| ANGPTL4 | ENSG00000167772 | This gene encodes a glycosylated, secreted protein containing a C-terminal fibrinogen domain. The encoded protein is induced by peroxisome proliferation activators and functions as a serum hormone that regulates glucose homeostasis, lipid metabolism, and insulin sensitivity. This protein can also act as an apoptosis survival factor for vascular endothelial cells and can prevent metastasis by inhibiting vascular growth and tumor cell invasion. The C-terminal domain may be proteolytically-cleaved from the full-length secreted protein. Decreased expression of this gene has been associated with type 2 diabetes. Alternative splicing results in multiple transcript variants. This gene was previously referred to as ANGPTL2 but has been renamed ANGPTL4. | angiopoietin like 4 | 51129 | NA |
| S100A7 | ENSG00000143556 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein differs from the other S100 proteins of known structure in its lack of calcium binding ability in one EF-hand at the N-terminus. The protein is overexpressed in hyperproliferative skin diseases, exhibits antimicrobial activities against bacteria and induces immunomodulatory activities. | S100 calcium binding protein A7 | 6278 | NA |
| CXCL3 | ENSG00000163734 | This antimicrobial gene encodes a member of the CXC subfamily of chemokines. The encoded protein is a secreted growth factor that signals through the G-protein coupled receptor, CXC receptor 2. This protein plays a role in inflammation and as a chemoattractant for neutrophils. | C-X-C motif chemokine ligand 3 | 2921 | NA |
| RAB26 | ENSG00000167964 | Members of the RAB protein family, including RAB26, are important regulators of vesicular fusion and trafficking. The RAB family of small G proteins regulates intercellular vesicle trafficking, including exocytosis, endocytosis, and recycling (summary by Seki et al., 2000 [PubMed 11043516]). | RAB26, member RAS oncogene family | 25837 | NA |
| DDAH1 | ENSG00000153904 | This gene belongs to the dimethylarginine dimethylaminohydrolase (DDAH) gene family. The encoded enzyme plays a role in nitric oxide generation by regulating cellular concentrations of methylarginines, which in turn inhibit nitric oxide synthase activity. | dimethylarginine dimethylaminohydrolase 1 | 23576 | NA |
| CXCL2 | ENSG00000081041 | This antimicrobial gene is part of a chemokine superfamily that encodes secreted proteins involved in immunoregulatory and inflammatory processes. The superfamily is divided into four subfamilies based on the arrangement of the N-terminal cysteine residues of the mature peptide. This chemokine, a member of the CXC subfamily, is expressed at sites of inflammation and may suppress hematopoietic progenitor cell proliferation. | C-X-C motif chemokine ligand 2 | 2920 | NA |
| IL20RB | ENSG00000174564 | IL20RB and IL20RA (MIM 605620) form a heterodimeric receptor for interleukin-20 (IL20; MIM 605619) (Blumberg et al., 2001 [PubMed 11163236]). | interleukin 20 receptor subunit beta | 53833 | NA |
| EGR1 | ENSG00000120738 | The protein encoded by this gene belongs to the EGR family of C2H2-type zinc-finger proteins. It is a nuclear protein and functions as a transcriptional regulator. The products of target genes it activates are required for differentiation and mitogenesis. Studies suggest this is a cancer suppressor gene. | early growth response 1 | 1958 | NA |
| RP11-396F22.1 | ENSG00000257718 | NA | NA | ENSG00000257718 | NA |
| CRCT1 | ENSG00000169509 | NA | cysteine rich C-terminal 1 | 54544 | NA |
| SLC4A4 | ENSG00000080493 | This gene encodes a sodium bicarbonate cotransporter (NBC) involved in the regulation of bicarbonate secretion and absorption and intracellular pH. Mutations in this gene are associated with proximal renal tubular acidosis. Multiple transcript variants encoding different isoforms have been found for this gene. | solute carrier family 4 member 4 | 8671 | NA |
| CALML5 | ENSG00000178372 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. | calmodulin like 5 | 51806 | NA |
| SLC25A18 | ENSG00000182902 | NA | solute carrier family 25 member 18 | 83733 | NA |
| PNLIP | ENSG00000175535 | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | pancreatic lipase | 5406 | NA |
| SHANK3 | ENSG00000251322 | NA | SH3 and multiple ankyrin repeat domains 3 | ENSG00000251322 | NA |
| STMN2 | ENSG00000104435 | This gene encodes a member of the stathmin family of phosphoproteins. Stathmin proteins function in microtubule dynamics and signal transduction. The encoded protein plays a regulatory role in neuronal growth and is also thought to be involved in osteogenesis. Reductions in the expression of this gene have been associated with Down’s syndrome and Alzheimer’s disease. Alternatively spliced transcript variants have been observed for this gene. A pseudogene of this gene is located on the long arm of chromosome 6. | stathmin 2 | 11075 | NA |
| NNAT | ENSG00000053438 | The protein encoded by this gene is a proteolipid that may be involved in the regulation of ion channels during brain development. The encoded protein may also play a role in forming and maintaining the structure of the nervous system. This gene is found within an intron of another gene, bladder cancer associated protein, but on the opposite strand. This gene is imprinted and is expressed only from the paternal allele. | neuronatin | 4826 | NA |
| SLC29A2 | ENSG00000174669 | The uptake of nucleosides by transporters, such as SLC29A2, is essential for nucleotide synthesis by salvage pathways in cells that lack de novo biosynthetic pathways. Nucleoside transport also plays a key role in the regulation of many physiologic processes through its effect on adenosine concentration at the cell surface (Griffiths et al., 1997 [PubMed 9396714]). | solute carrier family 29 member 2 | 3177 | NA |
| KCNJ12 | ENSG00000184185 | This gene encodes an inwardly rectifying K+ channel which may be blocked by divalent cations. This protein is thought to be one of multiple inwardly rectifying channels which contribute to the cardiac inward rectifier current (IK1). The gene is located within the Smith-Magenis syndrome region on chromosome 17. | potassium voltage-gated channel subfamily J member 12 | 3768 | NA |
| SLCO4A1 | ENSG00000101187 | NA | solute carrier organic anion transporter family member 4A1 | 28231 | NA |
| OLAH | ENSG00000152463 | NA | oleoyl-ACP hydrolase | 55301 | NA |
| UCHL1 | ENSG00000154277 | The protein encoded by this gene belongs to the peptidase C12 family. This enzyme is a thiol protease that hydrolyzes a peptide bond at the C-terminal glycine of ubiquitin. This gene is specifically expressed in the neurons and in cells of the diffuse neuroendocrine system. Mutations in this gene may be associated with Parkinson disease. | ubiquitin C-terminal hydrolase L1 | 7345 | NA |
| IL12A | ENSG00000168811 | This gene encodes a subunit of a cytokine that acts on T and natural killer cells, and has a broad array of biological activities. The cytokine is a disulfide-linked heterodimer composed of the 35-kD subunit encoded by this gene, and a 40-kD subunit that is a member of the cytokine receptor family. This cytokine is required for the T-cell-independent induction of interferon (IFN)-gamma, and is important for the differentiation of both Th1 and Th2 cells. The responses of lymphocytes to this cytokine are mediated by the activator of transcription protein STAT4. Nitric oxide synthase 2A (NOS2A/NOS2) is found to be required for the signaling process of this cytokine in innate immunity. | interleukin 12A | 3592 | NA |
| SLC29A4 | ENSG00000164638 | This gene encodes a member of the SLC29A/ENT transporter protein family. The encoded membrane protein catalyzes the reuptake of monoamines into presynaptic neurons, thus determining the intensity and duration of monoamine neural signaling. It has been shown to transport several compounds, including serotonin, dopamine, and the neurotoxin 1-methyl-4-phenylpyridinium. Alternative splicing results in multiple transcript variants. | solute carrier family 29 member 4 | 222962 | NA |
| ARC | ENSG00000198576 | NA | activity-regulated cytoskeleton-associated protein | 23237 | NA |
| PFN2 | ENSG00000070087 | The protein encoded by this gene is a ubiquitous actin monomer-binding protein belonging to the profilin family. It is thought to regulate actin polymerization in response to extracellular signals. There are two alternatively spliced transcript variants encoding different isoforms described for this gene. | profilin 2 | 5217 | NA |
| APOC1 | ENSG00000130208 | This gene encodes a member of the apolipoprotein C1 family. This gene is expressed primarily in the liver, and it is activated when monocytes differentiate into macrophages. The encoded protein plays a central role in high density lipoprotein (HDL) and very low density lipoprotein (VLDL) metabolism. This protein has also been shown to inhibit cholesteryl ester transfer protein in plasma. A pseudogene of this gene is located 4 kb downstream in the same orientation, on the same chromosome. This gene is mapped to chromosome 19, where it resides within an apolipoprotein gene cluster. | apolipoprotein C1 | 341 | NA |
| KRT1 | ENSG00000167768 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 1 | 3848 | NA |
| PITPNC1 | ENSG00000154217 | This gene encodes a member of the phosphatidylinositol transfer protein family. The encoded cytoplasmic protein plays a role in multiple processes including cell signaling and lipid metabolism by facilitating the transfer of phosphatidylinositol between membrane compartments. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and a pseudogene of this gene is located on the long arm of chromosome 1. | phosphatidylinositol transfer protein, cytoplasmic 1 | 26207 | NA |
| BATF3 | ENSG00000123685 | This gene encodes a member of the basic leucine zipper protein family. The encoded protein functions as a transcriptional repressor when heterodimerizing with JUN. The protein may play a role in repression of interleukin-2 and matrix metalloproteinase-1 transcription. | basic leucine zipper ATF-like transcription factor 3 | 55509 | NA |
| REG3A | ENSG00000172016 | This gene encodes a pancreatic secretory protein that may be involved in cell proliferation or differentiation. It has similarity to the C-type lectin superfamily. The enhanced expression of this gene is observed during pancreatic inflammation and liver carcinogenesis. The mature protein also functions as an antimicrobial protein with antibacterial activity. Alternate splicing results in multiple transcript variants that encode the same protein. | regenerating family member 3 alpha | 5068 | NA |
| ITPKA | ENSG00000137825 | Regulates inositol phosphate metabolism by phosphorylation of second messenger inositol 1,4,5-trisphosphate to Ins(1,3,4,5)P4. The activity of the inositol 1,4,5-trisphosphate 3-kinase is responsible for regulating the levels of a large number of inositol polyphosphates that are important in cellular signaling. Both calcium/calmodulin and protein phosphorylation mechanisms control its activity. It is also a substrate for the cyclic AMP-dependent protein kinase, calcium/calmodulin-dependent protein kinase II, and protein kinase C in vitro. | inositol-trisphosphate 3-kinase A | 3706 | NA |
| LY6D | ENSG00000167656 | NA | lymphocyte antigen 6 complex, locus D | 8581 | NA |
| CXCL1 | ENSG00000163739 | This antimicrobial gene encodes a member of the CXC subfamily of chemokines. The encoded protein is a secreted growth factor that signals through the G-protein coupled receptor, CXC receptor 2. This protein plays a role in inflammation and as a chemoattractant for neutrophils. Aberrant expression of this protein is associated with the growth and progression of certain tumors. A naturally occurring processed form of this protein has increased chemotactic activity. Alternate splicing results in coding and non-coding variants of this gene. A pseudogene of this gene is found on chromosome 4. | C-X-C motif chemokine ligand 1 | 2919 | NA |
| TOX2 | ENSG00000124191 | NA | TOX high mobility group box family member 2 | 84969 | NA |
| KLF10 | ENSG00000155090 | This gene encodes a member of a family of proteins that feature C2H2-type zinc finger domains. The encoded protein is a transcriptional repressor that acts as an effector of transforming growth factor beta signaling. Activity of this protein may inhibit the growth of cancers, particularly pancreatic cancer. Alternative splicing results in multiple transcript variants. | Kruppel like factor 10 | 7071 | NA |
| SLCO4A1-AS1 | ENSG00000232803 | NA | SLCO4A1 antisense RNA 1 | 100127888 | NA |
| COX7A1 | ENSG00000161281 | Cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. This component is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may function in the regulation and assembly of the complex. This nuclear gene encodes polypeptide 1 (muscle isoform) of subunit VIIa and the polypeptide 1 is present only in muscle tissues. Other polypeptides of subunit VIIa are present in both muscle and nonmuscle tissues, and are encoded by different genes. | cytochrome c oxidase subunit 7A1 | 1346 | NA |
| FAM107A | ENSG00000168309 | NA | family with sequence similarity 107 member A | 11170 | NA |
| NA | ENSG00000267473 | NA | NA | NA | TRUE |
| SEZ6L2 | ENSG00000174938 | This gene encodes a seizure-related protein that is localized on the cell surface. The gene is located in a region of chromosome 16p11.2 that is thought to contain candidate genes for autism spectrum disorders (ASD), though there is no evidence directly implicating this gene in ASD. Increased expression of this gene has been found in lung cancers, and the protein is therefore considered to be a novel prognostic marker for lung cancer. Alternative splicing of this gene results in multiple transcript variants. | seizure related 6 homolog like 2 | 26470 | NA |
| TMEM178A | ENSG00000152154 | NA | transmembrane protein 178A | 130733 | NA |
| GPD1 | ENSG00000167588 | This gene encodes a member of the NAD-dependent glycerol-3-phosphate dehydrogenase family. The encoded protein plays a critical role in carbohydrate and lipid metabolism by catalyzing the reversible conversion of dihydroxyacetone phosphate (DHAP) and reduced nicotinamide adenine dinucleotide (NADH) to glycerol-3-phosphate (G3P) and NAD+. The encoded cytosolic protein and mitochondrial glycerol-3-phosphate dehydrogenase also form a glycerol phosphate shuttle that facilitates the transfer of reducing equivalents from the cytosol to mitochondria. Mutations in this gene are a cause of transient infantile hypertriglyceridemia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | glycerol-3-phosphate dehydrogenase 1 | 2819 | NA |
| SMIM5 | ENSG00000204323 | NA | small integral membrane protein 5 | 643008 | NA |
| CA3 | ENSG00000164879 | Carbonic anhydrase III (CAIII) is a member of a multigene family (at least six separate genes are known) that encodes carbonic anhydrase isozymes. These carbonic anhydrases are a class of metalloenzymes that catalyze the reversible hydration of carbon dioxide and are differentially expressed in a number of cell types. The expression of the CA3 gene is strictly tissue specific and present at high levels in skeletal muscle and much lower levels in cardiac and smooth muscle. A proportion of carriers of Duchenne muscular dystrophy have a higher CA3 level than normal. The gene spans 10.3 kb and contains seven exons and six introns. | carbonic anhydrase 3 | 761 | NA |
| JPH1 | ENSG00000104369 | Junctional complexes between the plasma membrane and endoplasmic/sarcoplasmic reticulum are a common feature of all excitable cell types and mediate cross talk between cell surface and intracellular ion channels. The protein encoded by this gene is a component of junctional complexes and is composed of a C-terminal hydrophobic segment spanning the endoplasmic/sarcoplasmic reticulum membrane and a remaining cytoplasmic domain that shows specific affinity for the plasma membrane. This gene is a member of the junctophilin gene family. | junctophilin 1 | 56704 | NA |
| SLC25A22 | ENSG00000177542 | This gene encodes a mitochondrial glutamate carrier. Mutations in this gene are associated with early infantile epileptic encephalopathy. Multiple alternatively spliced variants, encoding the same protein, have been identified. | solute carrier family 25 member 22 | 79751 | NA |
| FAM153B | ENSG00000182230 | NA | family with sequence similarity 153 member B | 202134 | NA |
| LOC100507387 | ENSG00000182230 | NA | uncharacterized LOC100507387 | 100507387 | NA |
| RP11-490M8.1 | ENSG00000260025 | NA | NA | ENSG00000260025 | NA |
| ICAM5 | ENSG00000105376 | The protein encoded by this gene is a member of the intercellular adhesion molecule (ICAM) family. All ICAM proteins are type I transmembrane glycoproteins, contain 2-9 immunoglobulin-like C2-type domains, and bind to the leukocyte adhesion LFA-1 protein. This protein is expressed on the surface of telencephalic neurons and displays two types of adhesion activity, homophilic binding between neurons and heterophilic binding between neurons and leukocytes. It may be a critical component in neuron-microglial cell interactions in the course of normal development or as part of neurodegenerative diseases. | intercellular adhesion molecule 5 | 7087 | NA |
| NOCT | ENSG00000151014 | The protein encoded by this gene is highly similar to Nocturnin, a gene identified as a circadian clock regulated gene in Xenopus laevis. This protein and Nocturnin protein share similarity with the C-terminal domain of a yeast transcription factor, carbon catabolite repression 4 (CCR4). The mRNA abundance of a similar gene in mouse has been shown to exhibit circadian rhythmicity, which suggests a role for this protein in clock function or as a circadian clock effector. | nocturnin | 25819 | NA |
| CYP27B1 | ENSG00000111012 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The protein encoded by this gene localizes to the inner mitochondrial membrane where it hydroxylates 25-hydroxyvitamin D3 at the 1alpha position. This reaction synthesizes 1alpha,25-dihydroxyvitamin D3, the active form of vitamin D3, which binds to the vitamin D receptor and regulates calcium metabolism. Thus this enzyme regulates the level of biologically active vitamin D and plays an important role in calcium homeostasis. Mutations in this gene can result in vitamin D-dependent rickets type I. | cytochrome P450 family 27 subfamily B member 1 | 1594 | NA |
| CA3-AS1 | ENSG00000253549 | NA | CA3 antisense RNA 1 | 100996348 | NA |
| BNIPL | ENSG00000163141 | The protein encoded by this gene interacts with several other proteins, such as BCL2, ARHGAP1, MIF and GFER. It may function as a bridge molecule between BCL2 and ARHGAP1/CDC42 in promoting cell death. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | BCL2/adenovirus E1B 19kD interacting protein like | 149428 | NA |
| KCNIP4 | ENSG00000185774 | This gene encodes a member of the family of voltage-gated potassium (Kv) channel-interacting proteins (KCNIPs), which belong to the recoverin branch of the EF-hand superfamily. Members of the KCNIP family are small calcium binding proteins. They all have EF-hand-like domains, and differ from each other in the N-terminus. They are integral subunit components of native Kv4 channel complexes. They may regulate A-type currents, and hence neuronal excitability, in response to changes in intracellular calcium. This protein member also interacts with presenilin. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified for this gene. | potassium voltage-gated channel interacting protein 4 | 80333 | NA |
| ANK2 | ENSG00000145362 | This gene encodes a member of the ankyrin family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton. Ankyrins play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. The protein encoded by this gene is required for targeting and stability of Na/Ca exchanger 1 in cardiomyocytes. Mutations in this gene cause long QT syndrome 4 and cardiac arrhythmia syndrome. Multiple transcript variants encoding different isoforms have been described. | ankyrin 2, neuronal | 287 | NA |
| HSPH1 | ENSG00000120694 | NA | heat shock protein family H (Hsp110) member 1 | 10808 | NA |
| PRSS3 | ENSG00000010438 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is expressed in the brain and pancreas and is resistant to common trypsin inhibitors. It is active on peptide linkages involving the carboxyl group of lysine or arginine. This gene is localized to the locus of T cell receptor beta variable orphans on chromosome 9. Four transcript variants encoding different isoforms have been described for this gene. | protease, serine 3 | 5646 | NA |
| LOC100129518 | ENSG00000112096 | NA | uncharacterized LOC100129518 | 100129518 | NA |
| SOD2 | ENSG00000112096 | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | superoxide dismutase 2, mitochondrial | 6648 | NA |
| FAM171B | ENSG00000144369 | NA | family with sequence similarity 171 member B | 165215 | NA |
| MT1X | ENSG00000187193 | NA | metallothionein 1X | 4501 | NA |
| LIPE | ENSG00000079435 | The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | lipase E, hormone sensitive type | 3991 | NA |
| ACACB | ENSG00000076555 | Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | acetyl-CoA carboxylase beta | 32 | NA |
| PCK1 | ENSG00000124253 | This gene is a main control point for the regulation of gluconeogenesis. The cytosolic enzyme encoded by this gene, along with GTP, catalyzes the formation of phosphoenolpyruvate from oxaloacetate, with the release of carbon dioxide and GDP. The expression of this gene can be regulated by insulin, glucocorticoids, glucagon, cAMP, and diet. Defects in this gene are a cause of cytosolic phosphoenolpyruvate carboxykinase deficiency. A mitochondrial isozyme of the encoded protein also has been characterized. | phosphoenolpyruvate carboxykinase 1 | 5105 | NA |
| DNM1 | ENSG00000106976 | This gene encodes a member of the dynamin subfamily of GTP-binding proteins. The encoded protein possesses unique mechanochemical properties used to tubulate and sever membranes, and is involved in clathrin-mediated endocytosis and other vesicular trafficking processes. Actin and other cytoskeletal proteins act as binding partners for the encoded protein, which can also self-assemble leading to stimulation of GTPase activity. More than sixty highly conserved copies of the 3’ region of this gene are found elsewhere in the genome, particularly on chromosomes Y and 15. Alternatively spliced transcript variants encoding different isoforms have been described. | dynamin 1 | 1759 | NA |
| PSD | ENSG00000059915 | This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | pleckstrin and Sec7 domain containing | 5662 | NA |
| CEND1 | ENSG00000184524 | The protein encoded by this gene is a neuron-specific protein. The similar protein in pig enhances neuroblastoma cell differentiation in vitro and may be involved in neuronal differentiation in vivo. Multiple pseudogenes have been reported for this gene. | cell cycle exit and neuronal differentiation 1 | 51286 | NA |
# Save the queried Ensembl IDs for cluster 4, one per line, without header or quotes.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 4, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
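The annotation table above contains repeated query IDs (for example, ENSG00000182230 and ENSG00000112096 each map to two symbols) and rows flagged `notfound`, so the file written here can hold duplicate or unannotated IDs. A minimal sketch of one way to filter them before writing, if that is wanted: `out_toy`, `missing`, and `ids` are hypothetical names, and the toy data frame only mimics a few columns of the table rather than calling `mygene`. It also assumes the `notfound` column is present only when some IDs had no match, as the tables in this report suggest.

```r
# Toy stand-in for the queryMany() result; rows copied from the table above.
out_toy <- data.frame(
  query    = c("ENSG00000182230", "ENSG00000182230",
               "ENSG00000267473", "ENSG00000184524"),
  symbol   = c("FAM153B", "LOC100507387", NA, "CEND1"),
  notfound = c(NA, NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Flag rows explicitly marked as not found; treat NA as "found".
# Assumption: the notfound column may be absent when every ID matched.
missing <- if ("notfound" %in% names(out_toy)) {
  out_toy$notfound %in% TRUE
} else {
  rep(FALSE, nrow(out_toy))
}

# Keep one entry per annotated Ensembl ID, preserving the original order.
ids <- unique(out_toy$query[!missing])
print(ids)
```

Whether to drop unannotated IDs depends on the downstream use of the file; if every top-loading gene should be kept regardless of annotation, `unique(out_toy$query)` alone would remove only the duplicates.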
out <- mygene::queryMany(gene_list[5,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | query | symbol | summary | X_id | notfound |
|---|---|---|---|---|---|
| naked cuticle homolog 2 | ENSG00000145506 | NKD2 | This gene encodes a member of a family of proteins that function as negative regulators of Wnt receptor signaling through interaction with Dishevelled family members. The encoded protein participates in the delivery of transforming growth factor alpha-containing vesicles to the cell membrane. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 85409 | NA |
| apoptosis inducing factor, mitochondria associated 3 | ENSG00000183773 | AIFM3 | NA | 150209 | NA |
| pleckstrin and Sec7 domain containing | ENSG00000059915 | PSD | This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | 5662 | NA |
| Thy-1 cell surface antigen | ENSG00000154096 | THY1 | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | 7070 | NA |
| butyrylcholinesterase | ENSG00000114200 | BCHE | Mutant alleles at the BCHE locus are responsible for suxamethonium sensitivity. Homozygous persons sustain prolonged apnea after administration of the muscle relaxant suxamethonium in connection with surgical anesthesia. The activity of pseudocholinesterase in the serum is low and its substrate behavior is atypical. In the absence of the relaxant, the homozygote is at no known disadvantage. | 590 | NA |
| NA | ENSG00000263873 | RP11-334E6.12 | NA | ENSG00000263873 | NA |
| tachykinin receptor 2 | ENSG00000075073 | TACR2 | This gene belongs to a family of genes that function as receptors for tachykinins. Receptor affinities are specified by variations in the 5’-end of the sequence. The receptors belonging to this family are characterized by interactions with G proteins and 7 hydrophobic transmembrane regions. This gene encodes the receptor for the tachykinin neuropeptide substance K, also referred to as neurokinin A. | 6865 | NA |
| fucosyltransferase 2 | ENSG00000176920 | FUT2 | The protein encoded by this gene is a Golgi stack membrane protein that is involved in the creation of a precursor of the H antigen, which is required for the final step in the soluble A and B antigen synthesis pathway. This gene is one of two encoding the galactoside 2-L-fucosyltransferase enzyme. Two transcript variants encoding the same protein have been found for this gene. | 2524 | NA |
| S100 calcium binding protein A14 | ENSG00000189334 | S100A14 | This gene encodes a member of the S100 protein family which contains an EF-hand motif and binds calcium. The gene is located in a cluster of S100 genes on chromosome 1. Levels of the encoded protein have been found to be lower in cancerous tissue and associated with metastasis suggesting a tumor suppressor function (PMID: 19956863, 19351828). | 57402 | NA |
| peptidyl arginine deiminase 2 | ENSG00000117115 | PADI2 | This gene encodes a member of the peptidyl arginine deiminase family of enzymes, which catalyze the post-translational deimination of proteins by converting arginine residues into citrullines in the presence of calcium ions. The family members have distinct substrate specificities and tissue-specific expression patterns. The type II enzyme is the most widely expressed family member. Known substrates for this enzyme include myelin basic protein in the central nervous system and vimentin in skeletal muscle and macrophages. This enzyme is thought to play a role in the onset and progression of neurodegenerative human disorders, including Alzheimer disease and multiple sclerosis, and it has also been implicated in glaucoma pathogenesis. This gene exists in a cluster with four other paralogous genes. | 11240 | NA |
| NA | ENSG00000257499 | NA | NA | NA | TRUE |
| serine peptidase inhibitor, Kunitz type 1 | ENSG00000166145 | SPINT1 | The protein encoded by this gene is a member of the Kunitz family of serine protease inhibitors. The protein is a potent inhibitor specific for HGF activator and is thought to be involved in the regulation of the proteolytic activation of HGF in injured tissues. Alternative splicing results in multiple variants encoding different isoforms. | 6692 | NA |
| castor zinc finger 1 | ENSG00000130940 | CASZ1 | The protein encoded by this gene is a zinc finger transcription factor. The encoded protein may function as a tumor suppressor, and single nucleotide polymorphisms in this gene are associated with blood pressure variation. Alternative splicing results in multiple transcript variants that encode different protein isoforms. | 54897 | NA |
| ArfGAP with GTPase domain, ankyrin repeat and PH domain 2 | ENSG00000135439 | AGAP2 | The protein encoded by this gene belongs to the centaurin gamma-like family. It mediates anti-apoptotic effects of nerve growth factor by activating nuclear phosphoinositide 3-kinase. It is overexpressed in cancer cells, and promotes cancer cell invasion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | 116986 | NA |
| synemin | ENSG00000182253 | SYNM | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | 23336 | NA |
| ANO1 antisense RNA 1 | ENSG00000254902 | ANO1-AS1 | NA | ENSG00000254902 | NA |
| family with sequence similarity 46 member B | ENSG00000158246 | FAM46B | NA | 115572 | NA |
| galactosidase beta 1 like 2 | ENSG00000149328 | GLB1L2 | NA | 89944 | NA |
| cytochrome P450 family 4 subfamily F member 29, pseudogene | ENSG00000228314 | CYP4F29P | NA | 54055 | NA |
| protein tyrosine kinase 6 | ENSG00000101213 | PTK6 | The protein encoded by this gene is a cytoplasmic nonreceptor protein kinase which may function as an intracellular signal transducer in epithelial tissues. Overexpression of this gene in mammary epithelial cells leads to sensitization of the cells to epidermal growth factor and results in a partially transformed phenotype. Expression of this gene has been detected at low levels in some breast tumors but not in normal breast tissue. The encoded protein has been shown to undergo autophosphorylation. Alternative splicing results in multiple transcript variants. | 5753 | NA |
| circadian associated repressor of transcription | ENSG00000159208 | CIART | NA | 148523 | NA |
| calponin 1 | ENSG00000130176 | CNN1 | NA | 1264 | NA |
| RAR related orphan receptor A | ENSG00000069667 | RORA | The protein encoded by this gene is a member of the NR1 subfamily of nuclear hormone receptors. It can bind as a monomer or as a homodimer to hormone response elements upstream of several genes to enhance the expression of those genes. The encoded protein has been shown to interact with NM23-2, a nucleoside diphosphate kinase involved in organogenesis and differentiation, as well as with NM23-1, the product of a tumor metastasis suppressor candidate gene. Also, it has been shown to aid in the transcriptional regulation of some genes involved in circadian rhythm. Four transcript variants encoding different isoforms have been described for this gene. | 6095 | NA |
| transmembrane protein 52 | ENSG00000178821 | TMEM52 | NA | 339456 | NA |
| early growth response 3 | ENSG00000179388 | EGR3 | This gene encodes a transcriptional regulator that belongs to the EGR family of C2H2-type zinc-finger proteins. It is an immediate-early growth response gene which is induced by mitogenic stimulation. The protein encoded by this gene participates in the transcriptional regulation of genes in controlling biological rhythm. It may also play a role in a wide variety of processes including muscle development, lymphocyte development, endothelial cell growth and migration, and neuronal development. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 1960 | NA |
| sorbin and SH3 domain containing 1 | ENSG00000095637 | SORBS1 | This gene encodes a CBL-associated protein which functions in the signaling and stimulation of insulin. Mutations in this gene may be associated with human disorders of insulin resistance. Alternative splicing results in multiple transcript variants. | 10580 | NA |
| prominin 2 | ENSG00000155066 | PROM2 | This gene encodes a member of the prominin family of pentaspan membrane glycoproteins. The encoded protein localizes to basal epithelial cells and may be involved in the organization of plasma membrane microdomains. Alternative splicing results in multiple transcript variants. | 150696 | NA |
| angiomotin like 1 | ENSG00000166025 | AMOTL1 | The protein encoded by this gene is a peripheral membrane protein that is a component of tight junctions or TJs. TJs form an apical junctional structure and act to control paracellular permeability and maintain cell polarity. This protein is related to angiomotin, an angiostatin binding protein that regulates endothelial cell migration and capillary formation. Two transcript variants encoding different isoforms have been found for this gene. | 154810 | NA |
| dual oxidase 1 | ENSG00000137857 | DUOX1 | The protein encoded by this gene is a glycoprotein and a member of the NADPH oxidase family. The synthesis of thyroid hormone is catalyzed by a protein complex located at the apical membrane of thyroid follicular cells. This complex contains an iodide transporter, thyroperoxidase, and a peroxide generating system that includes proteins encoded by this gene and the similar DUOX2 gene. This protein is known as dual oxidase because it has both a peroxidase homology domain and a gp91phox domain. This protein generates hydrogen peroxide and thereby plays a role in the activity of thyroid peroxidase, lactoperoxidase, and in lactoperoxidase-mediated antimicrobial defense at mucosal surfaces. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. | 53905 | NA |
| NA | ENSG00000261762 | RP11-650L12.2 | NA | ENSG00000261762 | NA |
| A-kinase anchoring protein 1 | ENSG00000121057 | AKAP1 | The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. The encoded protein binds to type I and type II regulatory subunits of PKA and anchors them to the mitochondrion. This protein is speculated to be involved in the cAMP-dependent signal transduction pathway and in directing RNA to a specific cellular compartment. | 8165 | NA |
| tryptase alpha/beta 1 | ENSG00000172236 | TPSAB1 | Tryptases comprise a family of trypsin-like serine proteases, the peptidase family S1. Tryptases are enzymatically active only as heparin-stabilized tetramers, and they are resistant to all known endogenous proteinase inhibitors. Several tryptase genes are clustered on chromosome 16p13.3. These genes are characterized by several distinct features. They have a highly conserved 3’ UTR and contain tandem repeat sequences at the 5’ flank and 3’ UTR which are thought to play a role in regulation of the mRNA stability. These genes have an intron immediately upstream of the initiator Met codon, which separates the site of transcription initiation from protein coding sequence. This feature is characteristic of tryptases but is unusual in other genes. The alleles of this gene exhibit an unusual amount of sequence variation, such that the alleles were once thought to represent two separate genes, alpha and beta 1. Beta tryptases appear to be the main isoenzymes expressed in mast cells; whereas in basophils, alpha tryptases predominate. Tryptases have been implicated as mediators in the pathogenesis of asthma and other allergic and inflammatory disorders. | 7177 | NA |
| NA | ENSG00000267940 | RP11-290F24.6 | NA | ENSG00000267940 | NA |
| complexin 1 | ENSG00000168993 | CPLX1 | Proteins encoded by the complexin/synaphin gene family are cytosolic proteins that function in synaptic vesicle exocytosis. These proteins bind syntaxin, part of the SNAP receptor. The protein product of this gene binds to the SNAP receptor complex and disrupts it, allowing transmitter release. | 10815 | NA |
| WAP four-disulfide core domain 3 | ENSG00000124116 | WFDC3 | This gene encodes a member of the WAP-type four-disulfide core (WFDC) domain family. The WFDC domain, or WAP signature motif, contains eight cysteines forming four disulfide bonds at the core of the protein, and functions as a protease inhibitor. The encoded protein contains four WFDC domains. Most WFDC genes are localized to chromosome 20q12-q13 in two clusters: centromeric and telomeric. This gene belongs to the telomeric cluster. Alternatively spliced transcript variants have been observed but their full-length nature has not been determined. | 140686 | NA |
| NA | ENSG00000261054 | RP11-6O2.4 | NA | ENSG00000261054 | NA |
| coiled-coil domain containing 85C | ENSG00000205476 | CCDC85C | NA | 317762 | NA |
| NA | ENSG00000256469 | RP11-856F16.2 | NA | ENSG00000256469 | NA |
| NA | ENSG00000213144 | RP11-64B16.2 | NA | ENSG00000213144 | NA |
| transglutaminase 1 | ENSG00000092295 | TGM1 | The protein encoded by this gene is a membrane protein that catalyzes the addition of an alkyl group from an alkylamine to a glutamine residue of a protein, forming an alkylglutamine in the protein. This protein alkylation leads to crosslinking of proteins and catenation of polyamines to proteins. This gene contains either one or two copies of a 22 nt repeat unit in its 3’ UTR. Mutations in this gene have been associated with autosomal recessive lamellar ichthyosis (LI) and nonbullous congenital ichthyosiform erythroderma (NCIE). | 7051 | NA |
| BCL2 associated athanogene 3 | ENSG00000151929 | BAG3 | BAG proteins compete with Hip for binding to the Hsc70/Hsp70 ATPase domain and promote substrate release. All the BAG proteins have an approximately 45-amino acid BAG domain near the C terminus but differ markedly in their N-terminal regions. The protein encoded by this gene contains a WW domain in the N-terminal region and a BAG domain in the C-terminal region. The BAG domains of BAG1, BAG2, and BAG3 interact specifically with the Hsc70 ATPase domain in vitro and in mammalian cells. All 3 proteins bind with high affinity to the ATPase domain of Hsc70 and inhibit its chaperone activity in a Hip-repressible manner. | 9531 | NA |
| podoplanin | ENSG00000162493 | PDPN | This gene encodes a type-I integral membrane glycoprotein with diverse distribution in human tissues. The physiological function of this protein may be related to its mucin-type character. The homologous protein in other species has been described as a differentiation antigen and influenza-virus receptor. The specific function of this protein has not been determined but it has been proposed as a marker of lung injury. Alternatively spliced transcript variants encoding different isoforms have been identified. | 10630 | NA |
| myosin light chain kinase | ENSG00000065534 | MYLK | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | 4638 | NA |
| NA | ENSG00000261616 | RP11-6O2.3 | NA | ENSG00000261616 | NA |
| heat shock protein family B (small) member 8 | ENSG00000152137 | HSPB8 | The protein encoded by this gene belongs to the superfamily of small heat-shock proteins containing a conservative alpha-crystallin domain at the C-terminal part of the molecule. The expression of this gene is induced by estrogen in estrogen receptor-positive breast cancer cells, and this protein also functions as a chaperone in association with Bag3, a stimulator of macroautophagy. Thus, this gene appears to be involved in regulation of cell proliferation, apoptosis, and carcinogenesis, and mutations in this gene have been associated with different neuromuscular diseases, including Charcot-Marie-Tooth disease. | 26353 | NA |
| MICAL like 1 | ENSG00000100139 | MICALL1 | NA | 85377 | NA |
| coiled-coil domain containing 181 | ENSG00000117477 | CCDC181 | NA | 57821 | NA |
| heat shock protein family A (Hsp70) member 2 | ENSG00000126803 | HSPA2 | NA | 3306 | NA |
| retinoic acid receptor gamma | ENSG00000172819 | RARG | This gene encodes a retinoic acid receptor that belongs to the nuclear hormone receptor family. Retinoic acid receptors (RARs) act as ligand-dependent transcriptional regulators. When bound to ligands, RARs activate transcription by binding as heterodimers to the retinoic acid response elements (RARE) found in the promoter regions of the target genes. In their unbound form, RARs repress transcription of their target genes. RARs are involved in various biological processes, including limb bud development, skeletal growth, and matrix homeostasis. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5916 | NA |
| transmembrane protein 132A | ENSG00000006118 | TMEM132A | This gene encodes a protein that is highly similar to the rat Grp78-binding protein (GBP). Alternatively spliced transcript variants encoding different isoforms have been described. | 54972 | NA |
| mucin like 1 | ENSG00000172551 | MUCL1 | NA | 118430 | NA |
| solute carrier family 25 member 25 | ENSG00000148339 | SLC25A25 | The protein encoded by this gene belongs to the family of calcium-binding mitochondrial carriers, with a characteristic mitochondrial carrier domain at the C-terminus. These proteins are found in the inner membranes of mitochondria, and function as transport proteins. They shuttle metabolites, nucleotides and cofactors through the mitochondrial membrane and thereby connect and/or regulate cytoplasm and matrix functions. This protein may function as an ATP-Mg/Pi carrier that mediates the transport of Mg-ATP in exchange for phosphate, and is likely responsible for the net uptake or efflux of adenine nucleotides into or from the mitochondria. Alternatively spliced transcript variants encoding different isoforms with a common C-terminus but variable N-termini have been described for this gene. | 114789 | NA |
| creatine kinase B | ENSG00000166165 | CKB | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | 1152 | NA |
| carnitine palmitoyltransferase 1C | ENSG00000169169 | CPT1C | This gene encodes a member of the carnitine/choline acetyltransferase family. The encoded protein regulates the beta-oxidation and transport of long-chain fatty acids into mitochondria, and may play a role in the regulation of feeding behavior and whole-body energy homeostasis. Alternatively spliced transcript variants encoding multiple protein isoforms have been observed for this gene. | 126129 | NA |
| ankyrin repeat domain 9 | ENSG00000156381 | ANKRD9 | NA | 122416 | NA |
| NA | ENSG00000272986 | RP11-46J23.1 | NA | ENSG00000272986 | NA |
| uncharacterized LOC101929777 | ENSG00000108379 | LOC101929777 | NA | 101929777 | NA |
| Wnt family member 3 | ENSG00000108379 | WNT3 | The WNT gene family consists of structurally related genes which encode secreted signaling proteins. These proteins have been implicated in oncogenesis and in several developmental processes, including regulation of cell fate and patterning during embryogenesis. This gene is a member of the WNT gene family. It encodes a protein which shows 98% amino acid identity to mouse Wnt3 protein, and 84% to human WNT3A protein, another WNT gene product. The mouse studies show the requirement of Wnt3 in primary axis formation in the mouse. Studies of the gene expression suggest that this gene may play a key role in some cases of human breast, rectal, lung, and gastric cancer through activation of the WNT-beta-catenin-TCF signaling pathway. This gene is clustered with WNT15, another family member, in the chromosome 17q21 region. | 7473 | NA |
| NA | ENSG00000272084 | RP5-1126H10.2 | NA | ENSG00000272084 | NA |
| adenylate kinase 7 | ENSG00000140057 | AK7 | NA | 122481 | NA |
| TEA domain transcription factor 3 | ENSG00000007866 | TEAD3 | This gene product is a member of the transcriptional enhancer factor (TEF) family of transcription factors, which contain the TEA/ATTS DNA-binding domain. It is predominantly expressed in the placenta and is involved in the transactivation of the chorionic somatomammotropin-B gene enhancer. Translation of this protein is initiated at a non-AUG (AUA) start codon. | 7005 | NA |
| TNF alpha induced protein 8 like 1 | ENSG00000185361 | TNFAIP8L1 | NA | 126282 | NA |
| cytochrome P450 family 3 subfamily A member 5 | ENSG00000106258 | CYP3A5 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The encoded protein metabolizes drugs as well as the steroid hormones testosterone and progesterone. This gene is part of a cluster of cytochrome P450 genes on chromosome 7q21.1. Two pseudogenes of this gene have been identified within this cluster on chromosome 7. Expression of this gene is widely variable among populations, and a single nucleotide polymorphism that affects transcript splicing has been associated with susceptibility to hypertension. Alternative splicing results in multiple transcript variants. | 1577 | NA |
| 5’-nucleotidase domain containing 3 | ENSG00000111696 | NT5DC3 | NA | 51559 | NA |
| neurexophilin 3 | ENSG00000182575 | NXPH3 | NA | 11248 | NA |
| PGAM family member 5, mitochondrial serine/threonine protein phosphatase | ENSG00000247077 | PGAM5 | NA | 192111 | NA |
| V-set and immunoglobulin domain containing 2 | ENSG00000019102 | VSIG2 | NA | 23584 | NA |
| NA | ENSG00000262877 | RP11-1055B8.4 | NA | ENSG00000262877 | NA |
| tuftelin 1 | ENSG00000143367 | TUFT1 | Tuftelin is an acidic protein that is thought to play a role in dental enamel mineralization and is implicated in caries susceptibility. It is also thought to be involved with adaptation to hypoxia, mesenchymal stem cell function, and neurotrophin nerve growth factor mediated neuronal differentiation. | 7286 | NA |
| malignant fibrous histiocytoma amplified sequence 1 | ENSG00000147324 | MFHAS1 | Identified in a human 8p amplicon, this gene is a potential oncogene whose expression is enhanced in some malignant fibrous histiocytomas (MFH). The primary structure of its product includes an ATP/GTP-binding site, three leucine zipper domains, and a leucine-rich tandem repeat, which are structural or functional elements for interactions among proteins related to the cell cycle, and which suggest that overexpression might be oncogenic with respect to MFH. | 9258 | NA |
| transmembrane protein 79 | ENSG00000163472 | TMEM79 | NA | 84283 | NA |
| DnaJ heat shock protein family (Hsp40) member B5 | ENSG00000137094 | DNAJB5 | DNAJB5 belongs to the evolutionarily conserved DNAJ/HSP40 protein family. For background information on the DNAJ family, see MIM 608375. | 25822 | NA |
| NA | ENSG00000260911 | RP11-196G11.2 | NA | ENSG00000260911 | NA |
| colony stimulating factor 3 | ENSG00000108342 | CSF3 | The protein encoded by this gene is a cytokine that controls the production, differentiation, and function of granulocytes. The active protein is found extracellularly. Alternatively spliced transcript variants have been described for this gene. | 1440 | NA |
| tetraspanin 5 | ENSG00000168785 | TSPAN5 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | 10098 | NA |
| NA | ENSG00000263335 | AF001548.5 | NA | ENSG00000263335 | NA |
| interleukin 34 | ENSG00000157368 | IL34 | Interleukin-34 is a cytokine that promotes the differentiation and viability of monocytes and macrophages through the colony-stimulating factor-1 receptor (CSF1R; MIM 164770) (Lin et al., 2008 [PubMed 18467591]). | 146433 | NA |
| cholinergic receptor nicotinic alpha 5 subunit | ENSG00000169684 | CHRNA5 | The protein encoded by this gene is a nicotinic acetylcholine receptor subunit and a member of a superfamily of ligand-gated ion channels that mediate fast signal transmission at synapses. These receptors are thought to be heteropentamers composed of separate but similar subunits. Defects in this gene have been linked to susceptibility to lung cancer type 2 (LNCR2). | 1138 | NA |
| major facilitator superfamily domain containing 2A | ENSG00000168389 | MFSD2A | NA | 84879 | NA |
| SERPINE1 mRNA binding protein 1 pseudogene 3 | ENSG00000242142 | SERBP1P3 | NA | ENSG00000242142 | NA |
| NA | ENSG00000182319 | NA | NA | NA | TRUE |
| hes family bHLH transcription factor 6 | ENSG00000144485 | HES6 | This gene encodes a member of a subfamily of basic helix-loop-helix transcription repressors that have homology to the Drosophila enhancer of split genes. Members of this gene family regulate cell differentiation in numerous cell types. The protein encoded by this gene functions as a cofactor, interacting with other transcription factors through a tetrapeptide domain in its C-terminus. Alternatively spliced transcript variants encoding different isoforms have been described. | 55502 | NA |
| repulsive guidance molecule family member b | ENSG00000174136 | RGMB | RGMB is a glycosylphosphatidylinositol (GPI)-anchored member of the repulsive guidance molecule family (see RGMA, MIM 607362) and contributes to the patterning of the developing nervous system (Samad et al., 2005 [PubMed 15671031]). | 285704 | NA |
| NA | ENSG00000270605 | RP5-1092A3.4 | NA | ENSG00000270605 | NA |
| synuclein alpha interacting protein | ENSG00000064692 | SNCAIP | This gene encodes a protein containing several protein-protein interaction domains, including ankyrin-like repeats, a coiled-coil domain, and an ATP/GTP-binding motif. The encoded protein interacts with alpha-synuclein in neuronal tissue and may play a role in the formation of cytoplasmic inclusions and neurodegeneration. A mutation in this gene has been associated with Parkinson’s disease. Alternative splicing results in multiple transcript variants. | 9627 | NA |
| solute carrier family 45 member 3 | ENSG00000158715 | SLC45A3 | NA | 85414 | NA |
| prostaglandin E synthase 3 (cytosolic)-like | ENSG00000267060 | PTGES3L | NA | 100885848 | NA |
| solute carrier family 7 member 5 | ENSG00000103257 | SLC7A5 | NA | 8140 | NA |
| CD200 receptor 1 | ENSG00000163606 | CD200R1 | This gene encodes a receptor for the OX-2 membrane glycoprotein. Both the receptor and substrate are cell surface glycoproteins containing two immunoglobulin-like domains. This receptor is restricted to the surfaces of myeloid lineage cells and the receptor-substrate interaction may function as a myeloid downregulatory signal. Mouse studies of a related gene suggest that this interaction may control myeloid function in a tissue-specific manner. Alternative splicing of this gene results in multiple transcript variants. | 131450 | NA |
| macrophage stimulating 1-like | ENSG00000186715 | MST1L | NA | ENSG00000186715 | NA |
| NA | ENSG00000260466 | RP4-536B24.2 | NA | ENSG00000260466 | NA |
| LSM11, U7 small nuclear RNA associated | ENSG00000155858 | LSM11 | NA | 134353 | NA |
| plasminogen-like B1 | ENSG00000183281 | PLGLB1 | NA | 5343 | NA |
| oligodendrocyte myelin glycoprotein | ENSG00000126861 | OMG | NA | 4974 | NA |
| uncharacterized LOC102723927 | ENSG00000261186 | LOC102723927 | NA | 102723927 | NA |
| ribosomal protein S20 pseudogene 21 | ENSG00000244295 | RPS20P21 | NA | ENSG00000244295 | NA |
| natural killer cell cytotoxicity receptor 3 ligand 1 | ENSG00000188211 | NCR3LG1 | B7H6 belongs to the B7 family (see MIM 605402) and is selectively expressed on tumor cells. Interaction of B7H6 with NKp30 (NCR3; MIM 611550) results in natural killer (NK) cell activation and cytotoxicity (Brandt et al., 2009 [PubMed 19528259]). | 374383 | NA |
| calpain 5 | ENSG00000149260 | CAPN5 | Calpains are calcium-dependent cysteine proteases involved in signal transduction in a variety of cellular processes. A functional calpain protein consists of an invariant small subunit and 1 of a family of large subunits. CAPN5 is one of the large subunits. Unlike some of the calpains, CAPN5 and CAPN6 lack a calmodulin-like domain IV. Because of the significant similarity to Caenorhabditis elegans sex determination gene tra-3, CAPN5 is also called HTRA3. | 726 | NA |
| keratin 19 | ENSG00000171345 | KRT19 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | 3880 | NA |
| period circadian clock 2 | ENSG00000132326 | PER2 | This gene is a member of the Period family of genes and is expressed in a circadian pattern in the suprachiasmatic nucleus, the primary circadian pacemaker in the mammalian brain. Genes in this family encode components of the circadian rhythms of locomotor activity, metabolism, and behavior. This gene is upregulated by CLOCK/ARNTL heterodimers but then represses this upregulation in a feedback loop using PER/CRY heterodimers to interact with CLOCK/ARNTL. Polymorphisms in this gene may increase the risk of getting certain cancers and have been linked to sleep disorders. | 8864 | NA |
| NA | ENSG00000267194 | RP1-193H18.2 | NA | ENSG00000267194 | NA |
# Save the queried Ensembl IDs to the cluster 5 gene-name file, one ID per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 5, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
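The annotation table above lists ENSG00000108379 twice (once as LOC101929777, once as WNT3) and flags several IDs as `notfound`, so writing `out$query` directly carries those duplicates and unresolved IDs into the saved file. If that is not wanted downstream, the IDs could be filtered first. The following is only a minimal sketch under that assumption: `resolved_ids` is a hypothetical helper, not part of the analysis, and `out_demo` is a toy stand-in for the `queryMany` result built from rows shown above.

```r
# Toy stand-in for the object returned by mygene::queryMany();
# the IDs, symbols and notfound flags are copied from the table above.
out_demo <- data.frame(
  query    = c("ENSG00000108379", "ENSG00000108379",
               "ENSG00000182319", "ENSG00000171345"),
  symbol   = c("LOC101929777", "WNT3", NA, "KRT19"),
  notfound = c(NA, NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return each resolved Ensembl ID once, in original order.
resolved_ids <- function(res) {
  # The notfound column only exists when at least one query failed.
  nf <- if ("notfound" %in% names(res)) res$notfound else rep(NA, nrow(res))
  found <- is.na(nf) | !nf
  unique(as.character(res$query[found]))
}

ids <- resolved_ids(out_demo)
print(ids)
```

With the real result, the same helper could feed the existing call, for example `write.table(resolved_ids(out), ..., col.names = FALSE, row.names = FALSE, quote = FALSE)`. Whether unresolved IDs should be dropped depends on what the downstream step expects, so we treat this as optional.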
out <- mygene::queryMany(gene_list[6,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
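The columns of the query result do not come back in a fixed order: the first table in this document starts with `query`, while the table for this cluster starts with `summary`. To make the rendered tables easier to compare, the columns could be rearranged before calling `kable`. This is a sketch only; `order_columns` and the `preferred` ordering are hypothetical choices, and `demo` is a toy data frame using values from the rows below with the summary text left out.

```r
# Hypothetical preferred order; columns missing from a result are skipped.
preferred <- c("query", "symbol", "name", "X_id", "summary", "notfound")

# Hypothetical helper: put preferred columns first, keep any others after them.
order_columns <- function(df, preferred) {
  first <- intersect(preferred, names(df))
  rest  <- setdiff(names(df), first)
  df[, c(first, rest), drop = FALSE]
}

# Toy data frame in the column order of the table for this cluster.
demo <- data.frame(
  summary  = c(NA_character_, NA_character_),  # summary text omitted here
  X_id     = c("1440", NA),
  query    = c("ENSG00000108342", "ENSG00000255813"),
  symbol   = c("CSF3", NA),
  name     = c("colony stimulating factor 3", NA),
  notfound = c(NA, TRUE),
  stringsAsFactors = FALSE
)

ordered <- order_columns(demo, preferred)
print(names(ordered))
```

In the notebook this would be used as `kable(order_columns(as.data.frame(out), preferred))`.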
kable(as.data.frame(out))
| summary | X_id | query | symbol | name | notfound |
|---|---|---|---|---|---|
| The protein encoded by this gene is a cytokine that controls the production, differentiation, and function of granulocytes. The active protein is found extracellularly. Alternatively spliced transcript variants have been described for this gene. | 1440 | ENSG00000108342 | CSF3 | colony stimulating factor 3 | NA |
| Activins are dimeric growth and differentiation factors which belong to the transforming growth factor-beta (TGF-beta) superfamily of structurally related signaling proteins. Activins signal through a heteromeric complex of receptor serine kinases which include at least two type I (I and IB) and two type II (II and IIB) receptors. These receptors are all transmembrane proteins, composed of a ligand-binding extracellular domain with cysteine-rich region, a transmembrane domain, and a cytoplasmic domain with predicted serine/threonine specificity. Type I receptors are essential for signaling; and type II receptors are required for binding ligands and for expression of type I receptors. Type I and II receptors form a stable complex after ligand binding, resulting in phosphorylation of type I receptors by type II receptors. This gene encodes activin A type I receptor which signals a particular transcriptional response in concert with activin type II receptors. Mutations in this gene are associated with fibrodysplasia ossificans progressiva. | 90 | ENSG00000115170 | ACVR1 | activin A receptor type 1 | NA |
| The full-length protein encoded by this gene is an intracellular tetrapyrrole-binding protein. This protein includes a natural chemoattractant peptide of 21 amino acids at the N-terminus, which is a natural ligand for formyl peptide receptor-like receptor 2 (FPRL2) and promotes calcium mobilization and chemotaxis in monocytes and dendritic cells. | 50865 | ENSG00000013583 | HEBP1 | heme binding protein 1 | NA |
| This gene encodes the insulin receptor substrate 2, a cytoplasmic signaling molecule that mediates effects of insulin, insulin-like growth factor 1, and other cytokines by acting as a molecular adaptor between diverse receptor tyrosine kinases and downstream effectors. The product of this gene is phosphorylated by the insulin receptor tyrosine kinase upon receptor stimulation, as well as by an interleukin 4 receptor-associated kinase in response to IL4 treatment. | 8660 | ENSG00000185950 | IRS2 | insulin receptor substrate 2 | NA |
| This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | 4035 | ENSG00000123384 | LRP1 | LDL receptor related protein 1 | NA |
| NA | NA | ENSG00000255813 | NA | NA | TRUE |
| This gene is thought to play an important role in calcium homeostasis. The gene is expressed from two promoters and undergoes extensive alternative splicing. The encoded set of proteins share varying amounts of overlap near their N-termini but have substantial variations in their C-terminal domains resulting in distinct functional properties. The longest isoforms (a and f) include a C-terminal Aspartyl/Asparaginyl beta-hydroxylase domain that hydroxylates aspartic acid or asparagine residues in the epidermal growth factor (EGF)-like domains of some proteins, including protein C, coagulation factors VII, IX, and X, and the complement factors C1R and C1S. Other isoforms differ primarily in the C-terminal sequence and lack the hydroxylase domain, and some have been localized to the endoplasmic and sarcoplasmic reticulum. Some of these isoforms are found in complexes with calsequestrin, triadin, and the ryanodine receptor, and have been shown to regulate calcium release from the sarcoplasmic reticulum. Some isoforms have been implicated in metastasis. | 444 | ENSG00000198363 | ASPH | aspartate beta-hydroxylase | NA |
| The protein encoded by this gene is a member of the ankyrin repeat and SOCS box-containing (ASB) family of proteins. They contain ankyrin repeat sequence and a SOCS box domain. The SOCS box serves to couple suppressor of cytokine signalling (SOCS) proteins and their binding partners with the elongin B and C complex, possibly targeting them for degradation. Multiple alternatively spliced transcript variants, both protein-coding and not protein-coding, have been described for this gene. | 79754 | ENSG00000196372 | ASB13 | ankyrin repeat and SOCS box containing 13 | NA |
| The RAB5 protein is a small GTPase involved in membrane trafficking in the early endocytic pathway. The protein encoded by this gene binds the GTP-bound form of the RAB5 protein preferentially over the GDP-bound form, and functions as a guanine nucleotide exchange factor for RAB5. The encoded protein is found primarily as a tetramer in the cytoplasm and does not bind other members of the RAB family. Mutations in this gene cause macrocephaly alopecia cutis laxa and scoliosis (MACS) syndrome, an elastic tissue disorder, as well as the related connective tissue disorder, RIN2 syndrome. Alternative splicing results in multiple transcript variants. | 54453 | ENSG00000132669 | RIN2 | Ras and Rab interactor 2 | NA |
| NA | ENSG00000233547 | ENSG00000233547 | RP11-57H14.2 | NA | NA |
| NA | 83699 | ENSG00000198478 | SH3BGRL2 | SH3 domain binding glutamate rich protein like 2 | NA |
| This gene encodes a member of the thioredoxin family of enzymes. It is a cytosolic and ubiquitously expressed flavoprotein that catalyzes the two-electron reduction of quinone substrates and uses dihydronicotinamide riboside as a reducing coenzyme. Mutations in this gene have been associated with neurodegenerative diseases and several cancers. Alternative splicing results in multiple transcript variants. | 4835 | ENSG00000124588 | NQO2 | NAD(P)H quinone dehydrogenase 2 | NA |
| NA | 115548 | ENSG00000157107 | FCHO2 | FCH domain only 2 | NA |
| NA | 9788 | ENSG00000170873 | MTSS1 | metastasis suppressor 1 | NA |
| The protein encoded by this gene contains a RING finger motif and is similar to g1, a Drosophila zinc-finger protein that is expressed in mesoderm and involved in embryonic development. The expression of the mouse counterpart was found to be upregulated in myeloblastic cells following IL3 deprivation, suggesting that this gene may regulate growth factor withdrawal-induced apoptosis of myeloid precursor cells. Alternative splicing results in multiple transcript variants. | 55819 | ENSG00000113269 | RNF130 | ring finger protein 130 | NA |
| The protein encoded by this gene belongs to a small group of evolutionarily conserved proteins with three transmembrane domains. It is a potential target for ubiquitination by the Nedd4 family of proteins. This protein is thought to be part of a family of integral Golgi membrane proteins. | 80762 | ENSG00000131507 | NDFIP1 | Nedd4 family interacting protein 1 | NA |
| This gene is a member of the MAD gene family. The MAD genes encode basic helix-loop-helix-leucine zipper proteins that heterodimerize with MAX protein, forming a transcriptional repression complex. The MAD proteins compete for MAX binding with MYC, which heterodimerizes with MAX forming a transcriptional activation complex. Studies in rodents suggest that the MAD genes are tumor suppressors and contribute to the regulation of cell growth in differentiating tissues. | 10608 | ENSG00000123933 | MXD4 | MAX dimerization protein 4 | NA |
| NA | ENSG00000263640 | ENSG00000263640 | AF235103.1 | NA | NA |
| NA | 146547 | ENSG00000178226 | PRSS36 | protease, serine 36 | NA |
| NA | ENSG00000267543 | ENSG00000267543 | RP11-666A8.7 | NA | NA |
| This gene encodes a threonine synthase-like protein. A similar enzyme in mouse can catalyze the degradation of O-phospho-homoserine to a-ketobutyrate, phosphate, and ammonia. This protein also has phospho-lyase activity on both gamma and beta phosphorylated substrates. In mouse an alternatively spliced form of this protein has been shown to act as a cytokine and can induce the production of the inflammatory cytokine IL6 in osteoblasts. Alternate splicing results in multiple transcript variants. | 55258 | ENSG00000144115 | THNSL2 | threonine synthase like 2 | NA |
| The protein encoded by this gene is essential for bone resorption, and may play a critical role in vesicular transport in the osteoclast. Mutations in this gene are associated with autosomal recessive osteopetrosis type 6 (OPTB6). Alternatively spliced transcript variants have been found for this gene. | 9842 | ENSG00000225190 | PLEKHM1 | pleckstrin homology and RUN domain containing M1 | NA |
| NA | ENSG00000260306 | ENSG00000260306 | RP11-645C24.5 | NA | NA |
| NA | NA | ENSG00000264043 | NA | NA | TRUE |
| This gene encodes a member of the growth arrest-specific 2 protein family. This protein binds components of the cytoskeleton and may be involved in mediating interactions between microtubules and microfilaments. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 9. | 10634 | ENSG00000185340 | GAS2L1 | growth arrest specific 2 like 1 | NA |
| This gene is an ortholog of the C. elegans unc-76 gene, which is necessary for normal axonal bundling and elongation within axon bundles. Other orthologs include the rat gene that encodes zygin II, which can bind to synaptotagmin. | 9637 | ENSG00000171055 | FEZ2 | fasciculation and elongation protein zeta 2 | NA |
| NA | 100507103 | ENSG00000230537 | LOC100507103 | uncharacterized LOC100507103 | NA |
| This gene encodes a member of the C1 family of peptidases. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate multiple protein products. These products include the cathepsin B light and heavy chains, which can dimerize to form the double chain form of the enzyme. This enzyme is a lysosomal cysteine protease with both endopeptidase and exopeptidase activity that may play a role in protein turnover. It is also known as amyloid precursor protein secretase and is involved in the proteolytic processing of amyloid precursor protein (APP). Incomplete proteolytic processing of APP has been suggested to be a causative factor in Alzheimer’s disease, the most common cause of dementia. Overexpression of the encoded protein has been associated with esophageal adenocarcinoma and other tumors. Multiple pseudogenes of this gene have been identified. | 1508 | ENSG00000164733 | CTSB | cathepsin B | NA |
| This gene encodes a member of the pyruvate dehydrogenase kinase family. The encoded protein phosphorylates pyruvate dehydrogenase, down-regulating the activity of the mitochondrial pyruvate dehydrogenase complex. Overexpression of this gene may play a role in both cancer and diabetes. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 5164 | ENSG00000005882 | PDK2 | pyruvate dehydrogenase kinase 2 | NA |
| This locus encodes a guanine nucleotide-binding protein. The encoded protein, an alpha subunit in the Gq class, couples a seven-transmembrane domain receptor to activation of phospholipase C-beta. Mutations at this locus have been associated with problems in platelet activation and aggregation. A related pseudogene exists on chromosome 2. | 2776 | ENSG00000156052 | GNAQ | G protein subunit alpha q | NA |
| This gene encodes a lysosomal protein that interacts with RAB7, a small GTPase that controls transport to endocytic degradative compartments. Studies using mutant forms of the two proteins suggest that this protein represents a downstream effector for RAB7, and both proteins act together in the regulation of late endocytic traffic. A unique region of this protein has also been shown to be involved in the regulation of lysosomal morphology. | 83547 | ENSG00000167705 | RILP | Rab interacting lysosomal protein | NA |
| This gene encodes a pseudophosphatase and member of the myotubularin-related protein family. This gene maps within the CMT4B2 candidate region of chromosome 11p15 and mutations in this gene have been associated with Charcot-Marie-Tooth Disease, type 4B2. | 81846 | ENSG00000133812 | SBF2 | SET binding factor 2 | NA |
| The protein encoded by this gene is a cytosolic protein which contains a phosphotyrosine binding (PTD) domain. The PTD domain has been found to interact with the cytoplasmic tail of the LDL receptor. Mutations in this gene lead to LDL receptor malfunction and cause the disorder autosomal recessive hypercholesterolaemia. | 26119 | ENSG00000157978 | LDLRAP1 | low density lipoprotein receptor adaptor protein 1 | NA |
| NA | 23129 | ENSG00000004399 | PLXND1 | plexin D1 | NA |
| NA | ENSG00000261269 | ENSG00000261269 | RP11-389C8.2 | NA | NA |
| This gene encodes a member of the trypsin family of serine proteases. This protein is a secreted enzyme that is proposed to regulate the availability of insulin-like growth factors (IGFs) by cleaving IGF-binding proteins. It has also been suggested to be a regulator of cell growth. Variations in the promoter region of this gene are the cause of susceptibility to age-related macular degeneration type 7. | 5654 | ENSG00000166033 | HTRA1 | HtrA serine peptidase 1 | NA |
| This gene encodes a serine/threonine protein kinase that localizes to mitochondria. It is thought to protect cells from stress-induced mitochondrial dysfunction. Mutations in this gene cause one form of autosomal recessive early-onset Parkinson disease. | 65018 | ENSG00000158828 | PINK1 | PTEN induced putative kinase 1 | NA |
| NA | NA | ENSG00000256845 | NA | NA | TRUE |
| Angiomotin is a protein that binds angiostatin, a circulating inhibitor of the formation of new blood vessels (angiogenesis). Angiomotin mediates angiostatin inhibition of endothelial cell migration and tube formation in vitro. The protein encoded by this gene is related to angiomotin and is a member of the motin protein family. Alternative splicing results in multiple transcript variants of this gene. | 51421 | ENSG00000114019 | AMOTL2 | angiomotin like 2 | NA |
| NA | 100505635 | ENSG00000235033 | LOC100505635 | uncharacterized LOC100505635 | NA |
| NA | 26035 | ENSG00000138604 | GLCE | glucuronic acid epimerase | NA |
| This gene encodes a member of the transient receptor potential (TRP) cation channel gene family. The transmembrane protein localizes to intracellular vesicular membranes including lysosomes, and functions in the late endocytic pathway and in the regulation of lysosomal exocytosis. The channel is permeable to Ca(2+), Fe(2+), Na(+), K(+), and H(+), and is modulated by changes in Ca(2+) concentration. Mutations in this gene result in mucolipidosis type IV. | 57192 | ENSG00000090674 | MCOLN1 | mucolipin 1 | NA |
| NA | 64798 | ENSG00000155792 | DEPTOR | DEP domain containing MTOR-interacting protein | NA |
| NA | 404093 | ENSG00000180891 | CUEDC1 | CUE domain containing 1 | NA |
| This gene encodes a coiled-coil and calcium binding domain protein that appears to play a critical role in cilia formation. Mutations in this gene cause Meckel syndrome type 6, as well as Joubert syndrome type 9. Alternative splicing results in multiple transcript variants. | 57545 | ENSG00000048342 | CC2D2A | coiled-coil and C2 domain containing 2A | NA |
| The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region. The protein localizes to cytoplasmic filaments. It plays a neuroprotective role and functions as an E3-ubiquitin ligase in proteasome-mediated degradation of target proteins. Mutations in this gene can cause early-onset axonal neuropathy. Alternative splicing results in multiple transcript variants. | 23321 | ENSG00000109654 | TRIM2 | tripartite motif containing 2 | NA |
| NA | 146880 | ENSG00000215769 | LOC146880 | Rho GTPase activating protein 27 pseudogene | NA |
| NA | 57037 | ENSG00000106524 | ANKMY2 | ankyrin repeat and MYND domain containing 2 | NA |
| NA | ENSG00000267546 | ENSG00000267546 | RP11-666A8.8 | NA | NA |
| NA | ENSG00000273219 | ENSG00000273219 | RP11-644N4.1 | NA | NA |
| This gene encodes a membrane-bound protein from the major facilitator superfamily of transporters. Disruption of this gene by translocation has been associated with haplo-insufficiency and renal cell carcinomas. Alternatively spliced transcript variants have been described, but their biological validity has not yet been determined. | 84925 | ENSG00000138463 | DIRC2 | disrupted in renal carcinoma 2 | NA |
| The protein encoded by this gene is a DNA-binding, leucine zipper-containing transcription factor that acts as a homodimer or as a heterodimer. Depending on the binding site and binding partner, the encoded protein can be a transcriptional activator or repressor. This protein plays a role in the regulation of several cellular processes, including embryonic lens fiber cell development, increased T-cell susceptibility to apoptosis, and chondrocyte terminal differentiation. Defects in this gene are a cause of juvenile-onset pulverulent cataract as well as congenital cerulean cataract 4 (CCA4). Two transcript variants encoding different isoforms have been found for this gene. | 4094 | ENSG00000178573 | MAF | MAF bZIP transcription factor | NA |
| NA | 54621 | ENSG00000176834 | VSIG10 | V-set and immunoglobulin domain containing 10 | NA |
| Rho GTPases play a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors. The encoded protein may form a complex with G proteins and stimulate Rho-dependent signals. A similar protein in rat interacts with glutamate transporter EAAT4 and modulates its glutamate transport activity. Expression of the rat protein induces the reorganization of the actin cytoskeleton and its overexpression induces the formation of membrane ruffling and filopodia. Two alternative transcripts encoding different isoforms have been described. | 9826 | ENSG00000132694 | ARHGEF11 | Rho guanine nucleotide exchange factor 11 | NA |
| NA | 57333 | ENSG00000142552 | RCN3 | reticulocalbin 3 | NA |
| This gene encodes amyloid precursor-like protein 2 (APLP2), which is a member of the APP (amyloid precursor protein) family including APP, APLP1 and APLP2. This protein is ubiquitously expressed. It contains heparin-, copper- and zinc-binding domains at the N-terminus, BPTI/Kunitz inhibitor and E2 domains in the middle region, and transmembrane and intracellular domains at the C-terminus. This protein interacts with major histocompatibility complex (MHC) class I molecules. The synergy of this protein and the APP is required to mediate neuromuscular transmission, spatial learning and synaptic plasticity. This protein has been implicated in the pathogenesis of Alzheimer’s disease. Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | 334 | ENSG00000084234 | APLP2 | amyloid beta precursor like protein 2 | NA |
| NA | 149076 | ENSG00000160094 | ZNF362 | zinc finger protein 362 | NA |
| Prenylcysteine is released during the degradation of prenylated proteins. PCYOX1 catalyzes the degradation of prenylcysteine to yield free cysteines and a hydrophobic isoprenoid product (Tschantz et al., 1999 [PubMed 10585463]). | 51449 | ENSG00000116005 | PCYOX1 | prenylcysteine oxidase 1 | NA |
| NA | 285512 | ENSG00000248019 | FAM13A-AS1 | FAM13A antisense RNA 1 | NA |
| This gene encodes a cytosolic enzyme that catalyzes the activation of acetate for use in lipid synthesis and energy generation. The protein acts as a monomer and produces acetyl-CoA from acetate in a reaction that requires ATP. Expression of this gene is regulated by sterol regulatory element-binding proteins, transcription factors that activate genes required for the synthesis of cholesterol and unsaturated fatty acids. Alternative splicing results in multiple transcript variants. | 55902 | ENSG00000131069 | ACSS2 | acyl-CoA synthetase short-chain family member 2 | NA |
| This gene encodes a deoxyribonucleoside kinase that specifically phosphorylates thymidine, deoxycytidine, and deoxyuridine. The encoded enzyme localizes to the mitochondria and is required for mitochondrial DNA synthesis. Mutations in this gene are associated with a myopathic form of mitochondrial DNA depletion syndrome. Alternate splicing results in multiple transcript variants encoding distinct isoforms, some of which lack transit peptide, so are not localized to mitochondria. | 7084 | ENSG00000166548 | TK2 | thymidine kinase 2, mitochondrial | NA |
| NA | 10079 | ENSG00000054793 | ATP9A | ATPase phospholipid transporting 9A (putative) | NA |
| NA | 221442 | ENSG00000161912 | ADCY10P1 | adenylate cyclase 10 (soluble) pseudogene 1 | NA |
| The protein encoded by this gene is a type II integral membrane protein that belongs to the 3-O-sulfotransferases family. These proteins catalyze the addition of sulfate groups at the 3-OH position of glucosamine in heparan sulfate. The substrate specificity of individual members of the family is based on prior modification of the heparan sulfate chain, thus allowing different members of the family to generate binding sites for different proteins on the same heparan sulfate chain. Following treatment with a histone deacetylase inhibitor, expression of this gene is activated in a pancreatic cell line. The increased expression results in promotion of the epithelial-mesenchymal transition. In addition, the modification catalyzed by this protein allows herpes simplex virus membrane fusion and penetration. A very closely related homolog with an almost identical sulfotransferase domain maps less than 1 Mb away. Alternative splicing results in multiple transcript variants. | 9953 | ENSG00000125430 | HS3ST3B1 | heparan sulfate-glucosamine 3-sulfotransferase 3B1 | NA |
| NA | ENSG00000227201 | ENSG00000227201 | CNN2P1 | calponin 2 pseudogene 1 | NA |
| NA | ENSG00000261064 | ENSG00000261064 | RP11-1000B6.3 | NA | NA |
| C6ORF49 is a member of the LIM domain protein family (Teufel et al., 2005 [PubMed 15702247]). | 29964 | ENSG00000124593 | PRICKLE4 | prickle planar cell polarity protein 4 | NA |
| NA | ENSG00000200278 | ENSG00000200278 | RNA5SP352 | RNA, 5S ribosomal pseudogene 352 | NA |
| NA | 80221 | ENSG00000167107 | ACSF2 | acyl-CoA synthetase family member 2 | NA |
| NA | NA | ENSG00000230633 | NA | NA | TRUE |
| This gene encodes the mitochondrial enzyme ornithine aminotransferase, which is a key enzyme in the pathway that converts arginine and ornithine into the major excitatory and inhibitory neurotransmitters glutamate and GABA. Mutations that result in a deficiency of this enzyme cause the autosomal recessive eye disease Gyrate Atrophy. Alternatively spliced transcript variants encoding different isoforms have been described. Related pseudogenes have been defined on the X chromosome. | 4942 | ENSG00000065154 | OAT | ornithine aminotransferase | NA |
| This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins. It has also been determined to be a type-1 diabetes autoantigen, also known as islet cell antibody 12. | 9580 | ENSG00000143842 | SOX13 | SRY-box 13 | NA |
| The protein encoded by this gene is a member of the gelsolin/villin family of actin regulatory proteins. This protein has structural similarity to villin. It binds actin and may play a role in the development of neuronal cells that form ganglia. | 10677 | ENSG00000135407 | AVIL | advillin | NA |
| The protein encoded by this gene acts as a homodimer, using ATP hydrolysis to catalyze the conversion of 5-oxo-L-proline to L-glutamate. Defects in this gene are a cause of 5-oxoprolinase deficiency (OPLAHD). | 26873 | ENSG00000178814 | OPLAH | 5-oxoprolinase (ATP-hydrolysing) | NA |
| NA | 54884 | ENSG00000042445 | RETSAT | retinol saturase | NA |
| This gene encodes a cytoskeletal protein involved in actin-membrane attachment at sites of cell adhesion to the extracellular matrix (focal adhesion). Alternatively spliced transcript variants encoding different isoforms have been described for this gene. These isoforms exhibit different expression patterns and have different biochemical as well as physiological properties (PMID:9054445). | 5829 | ENSG00000089159 | PXN | paxillin | NA |
| This gene encodes an enzyme which removes 9-O-acetylation modifications from sialic acids. Mutations in this gene are associated with susceptibility to autoimmune disease 6. Multiple transcript variants encoding different isoforms, found either in the cytosol or in the lysosome, have been found for this gene. | 54414 | ENSG00000110013 | SIAE | sialic acid acetylesterase | NA |
| NA | 57515 | ENSG00000111897 | SERINC1 | serine incorporator 1 | NA |
| NA | ENSG00000255857 | ENSG00000255857 | PXN-AS1 | PXN antisense RNA 1 | NA |
| NA | NA | ENSG00000256142 | NA | NA | TRUE |
| This gene belongs to the chemokine-like factor gene superfamily, a novel family that is similar to the chemokine and the transmembrane 4 superfamilies of signaling molecules. This gene is one of several chemokine-like factor genes located in a cluster on chromosome 16. Alternatively spliced transcript variants encoding different isoforms have been identified. | 146223 | ENSG00000183723 | CMTM4 | CKLF like MARVEL transmembrane domain containing 4 | NA |
| NA | 81553 | ENSG00000197872 | FAM49A | family with sequence similarity 49 member A | NA |
| NA | NA | ENSG00000272091 | NA | NA | TRUE |
| The product of this gene belongs to the Serine/Threonine protein kinase family, and to the Ca(2+)/calmodulin-dependent protein kinase subfamily. The major isoform of this gene plays a role in the calcium/calmodulin-dependent (CaM) kinase cascade by phosphorylating the downstream kinases CaMK1 and CaMK4. Protein products of this gene also phosphorylate AMP-activated protein kinase (AMPK). This gene has its strongest expression in the brain and influences signalling cascades involved with learning and memory, neuronal differentiation and migration, neurite outgrowth, and synapse formation. Alternative splicing results in multiple transcript variants encoding distinct isoforms. The identified isoforms differ in their ability to undergo autophosphorylation and to phosphorylate downstream kinases. | 10645 | ENSG00000110931 | CAMKK2 | calcium/calmodulin-dependent protein kinase kinase 2 | NA |
| NA | ENSG00000269976 | ENSG00000269976 | RP11-130L8.2 | NA | NA |
| This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. Mutations in this gene are thought to be responsible for the symptoms of a subset of patients with Ehlers-Danlos syndrome type III. Messages of several sizes can be detected in northern blots but sequence information cannot confirm the identity of the shorter messages. | 50509 | ENSG00000080573 | COL5A3 | collagen type V alpha 3 | NA |
| Syntrophins are cytoplasmic peripheral membrane scaffold proteins that are components of the dystrophin-associated protein complex. This gene is a member of the syntrophin gene family and encodes the most common syntrophin isoform found in cardiac tissues. The N-terminal PDZ domain of this syntrophin protein interacts with the C-terminus of the pore-forming alpha subunit (SCN5A) of the cardiac sodium channel Nav1.5. This protein also associates cardiac sodium channels with the nitric oxide synthase-PMCA4b (plasma membrane Ca-ATPase subtype 4b) complex in cardiomyocytes. This gene is a susceptibility locus for Long-QT syndrome (LQT) - an inherited disorder associated with sudden cardiac death from arrhythmia - and sudden infant death syndrome (SIDS). This protein also associates with dystrophin and dystrophin-related proteins at the neuromuscular junction and alters intracellular calcium ion levels in muscle tissue. | 6640 | ENSG00000101400 | SNTA1 | syntrophin alpha 1 | NA |
| The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | 2934 | ENSG00000148180 | GSN | gelsolin | NA |
| NA | 55652 | ENSG00000211584 | SLC48A1 | solute carrier family 48 member 1 | NA |
| NA | ENSG00000254317 | ENSG00000254317 | RP11-473O4.5 | NA | NA |
| NA | ENSG00000237781 | ENSG00000237781 | RP11-54A4.2 | NA | NA |
| NA | 146691 | ENSG00000175662 | TOM1L2 | target of myb1 like 2 membrane trafficking protein | NA |
| This gene encodes a phox (PX) domain-containing protein which may be involved in synaptic transmission and the ligand-induced internalization and degradation of epidermal growth factors. Variations in this gene may be associated with susceptibility to systemic lupus erythematosus (SLE). Alternative splicing results in multiple transcript variants. | 54899 | ENSG00000168297 | PXK | PX domain containing serine/threonine kinase like | NA |
| This gene encodes a protein similar to guanosine nucleotide exchange factors for Rho GTPases. The encoded protein contains in its C-terminus a GEF domain involved in exchange activity and a pleckstrin homology domain. Alternatively spliced transcripts that encode different proteins have been described. | 55701 | ENSG00000165801 | ARHGEF40 | Rho guanine nucleotide exchange factor 40 | NA |
| NA | 11328 | ENSG00000122642 | FKBP9 | FK506 binding protein 9 | NA |
| This gene encodes a dual serine/threonine and tyrosine protein kinase which is expressed in multiple tissues. It is thought to function as a regulator of cell death. Multiple transcript variants encoding different isoforms have been found for this gene. | 25778 | ENSG00000133059 | DSTYK | dual serine/threonine and tyrosine protein kinase | NA |
| This gene encodes a highly conserved preproprotein that is proteolytically processed to generate four main cleavage products including saposins A, B, C, and D. Each domain of the precursor protein is approximately 80 amino acid residues long with nearly identical placement of cysteine residues and glycosylation sites. Saposins A-D localize primarily to the lysosomal compartment where they facilitate the catabolism of glycosphingolipids with short oligosaccharide groups. The precursor protein exists both as a secretory protein and as an integral membrane protein and has neurotrophic activities. Mutations in this gene have been associated with Gaucher disease and metachromatic leukodystrophy. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | 5660 | ENSG00000197746 | PSAP | prosaposin | NA |
| NA | 387680 | ENSG00000099290 | FAM21A | family with sequence similarity 21 member A | NA |
| NA | 100134229 | ENSG00000260231 | JHDM1D-AS1 | JHDM1D antisense RNA 1 (head to head) | NA |
| NA | 122953 | ENSG00000140044 | JDP2 | Jun dimerization protein 2 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",6,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
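The export step above is repeated once per cluster with only the cluster number changing. A minimal sketch of how it could be wrapped in a helper is given below. The function name `write_cluster_genes` and the `out_dir` argument are hypothetical and not part of the original analysis, and the toy example writes to a temporary directory rather than to `../utilities/GTEX2013_sparse_load_voom/`.

```r
# Hypothetical helper (not in the original analysis): write the query IDs of
# one cluster to "gene_names_clus_<cluster>.txt", one ID per line.
write_cluster_genes <- function(query_ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(as.character(query_ids), path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)  # return the file path so the caller can check or reuse it
}

# Toy example: three Ensembl IDs from the table above, written to tempdir().
ids <- c("ENSG00000115170", "ENSG00000013583", "ENSG00000185950")
path <- write_cluster_genes(ids, cluster = 6, out_dir = tempdir())
readLines(path)
```

In the analysis itself, a call such as `write_cluster_genes(out$query, 6, "../utilities/GTEX2013_sparse_load_voom")` would stand in for the explicit `write.table` call, assuming `out` is the result of `mygene::queryMany` as above.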
out <- mygene::queryMany(gene_list[7,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | name | query | summary | notfound |
|---|---|---|---|---|---|
| TNNC1 | 7134 | troponin C1, slow skeletal and cardiac type | ENSG00000114854 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | NA |
| MYOZ2 | 51778 | myozenin 2 | ENSG00000172399 | The protein encoded by this gene belongs to a family of sarcomeric proteins that bind to calcineurin, a phosphatase involved in calcium-dependent signal transduction in diverse cell types. These family members tether calcineurin to alpha-actinin at the z-line of the sarcomere of cardiac and skeletal muscle cells, and thus they are important for calcineurin signaling. Mutations in this gene cause cardiomyopathy familial hypertrophic type 16, a hereditary heart disorder. | NA |
| EEF1A2 | 1917 | eukaryotic translation elongation factor 1 alpha 2 | ENSG00000101210 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 2) is expressed in brain, heart and skeletal muscle, and the other isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas. This gene may be critical in the development of ovarian cancer. | NA |
| CRYM | 1428 | crystallin mu | ENSG00000103316 | Crystallins are separated into two classes: taxon-specific and ubiquitous. The former class is also called phylogenetically-restricted crystallins. The latter class constitutes the major proteins of vertebrate eye lens and maintains the transparency and refractive index of the lens. This gene encodes a taxon-specific crystallin protein that binds NADPH and has sequence similarity to bacterial ornithine cyclodeaminases. The encoded protein does not perform a structural role in lens tissue, and instead it binds thyroid hormone for possible regulatory or developmental roles. Mutations in this gene have been associated with autosomal dominant non-syndromic deafness. | NA |
| PKP2 | 5318 | plakophilin 2 | ENSG00000057294 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This gene product may regulate the signaling activity of beta-catenin. Two alternately spliced transcripts encoding two protein isoforms have been identified. A processed pseudogene with high similarity to this locus has been mapped to chromosome 12p13. | NA |
| HRC | 3270 | histidine rich calcium binding protein | ENSG00000130528 | This gene encodes a luminal sarcoplasmic reticulum protein identified by its ability to bind low-density lipoprotein with high affinity. The protein interacts with the cytoplasmic domain of triadin, the main transmembrane protein of the junctional sarcoplasmic reticulum (SR) of skeletal muscle. The protein functions in the regulation of releasable calcium into the SR. | NA |
| FAM78A | 286336 | family with sequence similarity 78 member A | ENSG00000126882 | NA | NA |
| TCAP | 8557 | titin-cap | ENSG00000173991 | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | NA |
| WDR62 | 284403 | WD repeat domain 62 | ENSG00000075702 | This gene is proposed to play a role in cerebral cortical development. Mutations in this gene have been associated with microencephaly, cortical malformations, and mental retardation. Alternative splicing results in multiple transcript variants. | NA |
| PTGDS | 5730 | prostaglandin D2 synthase | ENSG00000107317 | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | NA |
| ACTN2 | 88 | actinin alpha 2 | ENSG00000077522 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| MYH7B | 57644 | myosin, heavy chain 7B, cardiac muscle, beta | ENSG00000078814 | The myosin II molecule is a multi-subunit complex consisting of two heavy chains and four light chains. This gene encodes a heavy chain of myosin II, which is a member of the motor-domain superfamily. The heavy chain includes a globular motor domain, which catalyzes ATP hydrolysis and interacts with actin, and a tail domain in which heptad repeat sequences promote dimerization by interacting to form a rod-like alpha-helical coiled coil. This heavy chain subunit is a slow-twitch myosin. Alternatively spliced transcript variants have been found, but the full-length nature of these variants is not determined. | NA |
| ACTC1 | 70 | actin, alpha, cardiac muscle 1 | ENSG00000159251 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | NA |
| RP11-11N9.4 | ENSG00000247134 | NA | ENSG00000247134 | NA | NA |
| VSTM2L | 128434 | V-set and transmembrane domain containing 2 like | ENSG00000132821 | NA | NA |
| CASQ2 | 845 | calsequestrin 2 | ENSG00000118729 | The protein encoded by this gene specifies the cardiac muscle family member of the calsequestrin family. Calsequestrin is localized to the sarcoplasmic reticulum in cardiac and slow skeletal muscle cells. The protein is a calcium binding protein that stores calcium for muscle function. Mutations in this gene cause stress-induced polymorphic ventricular tachycardia, also referred to as catecholaminergic polymorphic ventricular tachycardia 2 (CPVT2), a disease characterized by bidirectional ventricular tachycardia that may lead to cardiac arrest. | NA |
| LDB3 | 11155 | LIM domain binding 3 | ENSG00000122367 | This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | NA |
| SLC2A4 | 6517 | solute carrier family 2 member 4 | ENSG00000181856 | This gene is a member of the solute carrier family 2 (facilitated glucose transporter) family and encodes a protein that functions as an insulin-regulated facilitative glucose transporter. In the absence of insulin, this integral membrane protein is sequestered within the cells of muscle and adipose tissue. Within minutes of insulin stimulation, the protein moves to the cell surface and begins to transport glucose across the cell membrane. Mutations in this gene have been associated with noninsulin-dependent diabetes mellitus (NIDDM). | NA |
| SHISA3 | 152573 | shisa family member 3 | ENSG00000178343 | NA | NA |
| TNFAIP6 | 7130 | TNF alpha induced protein 6 | ENSG00000123610 | The protein encoded by this gene is a secretory protein that contains a hyaluronan-binding domain, and thus is a member of the hyaluronan-binding protein family. The hyaluronan-binding domain is known to be involved in extracellular matrix stability and cell migration. This protein has been shown to form a stable complex with inter-alpha-inhibitor (I alpha I), and thus enhance the serine protease inhibitory activity of I alpha I, which is important in the protease network associated with inflammation. This gene can be induced by proinflammatory cytokines such as tumor necrosis factor alpha and interleukin-1. Enhanced levels of this protein are found in the synovial fluid of patients with osteoarthritis and rheumatoid arthritis. | NA |
| FNDC5 | 252995 | fibronectin type III domain containing 5 | ENSG00000160097 | This gene encodes a secreted protein that is released from muscle cells during exercise. The encoded protein may participate in the development of brown fat. Translation of the precursor protein initiates at a non-AUG start codon at a position that is conserved as an AUG start codon in other organisms. Alternative splicing results in multiple transcript variants. | NA |
| AC017116.11 | ENSG00000239775 | NA | ENSG00000239775 | NA | NA |
| PI16 | 221476 | peptidase inhibitor 16 | ENSG00000164530 | NA | NA |
| LTK | 4058 | leukocyte receptor tyrosine kinase | ENSG00000062524 | The protein encoded by this gene is a member of the ros/insulin receptor family of tyrosine kinases. Tyrosine-specific phosphorylation of proteins is a key to the control of diverse pathways leading to cell growth and differentiation. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| FHOD3 | 80206 | formin homology 2 domain containing 3 | ENSG00000134775 | The protein encoded by this gene is a member of the diaphanous-related formins (DRF), and contains multiple domains, including GBD (GTPase-binding domain), DID (diaphanous inhibitory domain), FH1 (formin homology 1), FH2 (formin homology 2), and DAD (diaphanous auto-regulatory domain) domains. This protein is thought to play a role in actin filament polymerization in cardiomyocytes. Mutations in this gene have been associated with dilated cardiomyopathy (DCM), characterized by dilation of the ventricular chamber, leading to impairment of systolic pump function and subsequent heart failure. Increased levels of the protein encoded by this gene have been observed in individuals with hypertrophic cardiomyopathy (HCM). Alternative splicing results in multiple transcript variants encoding different isoforms. A muscle-specific isoform has been shown to possess a casein kinase 2 (CK2) phosphorylation site at the C-terminal end of the FH2 domain. Phosphorylation of this site alters its interaction with sequestosome 1 (SQSTM1), and targets this isoform to myofibrils, while other isoforms form cytoplasmic aggregates. | NA |
| TSPAN32 | 10077 | tetraspanin 32 | ENSG00000064201 | This gene, which is a member of the tetraspanin superfamily, is one of several tumor-suppressing subtransferable fragments located in the imprinted gene domain of chromosome 11p15.5, an important tumor-suppressor gene region. Alterations in this region have been associated with Beckwith-Wiedemann syndrome, Wilms tumor, rhabdomyosarcoma, adrenocortical carcinoma, and lung, ovarian and breast cancers. This gene is located among several imprinted genes; however, this gene, as well as the tumor-suppressing subchromosomal transferable fragment 4, escapes imprinting. This gene may play a role in malignancies and diseases that involve this region, and it is also involved in hematopoietic cell function. Alternatively spliced transcript variants have been described, but their biological validity has not been determined. | NA |
| BCHE | 590 | butyrylcholinesterase | ENSG00000114200 | Mutant alleles at the BCHE locus are responsible for suxamethonium sensitivity. Homozygous persons sustain prolonged apnea after administration of the muscle relaxant suxamethonium in connection with surgical anesthesia. The activity of pseudocholinesterase in the serum is low and its substrate behavior is atypical. In the absence of the relaxant, the homozygote is at no known disadvantage. | NA |
| LPL | 4023 | lipoprotein lipase | ENSG00000175445 | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | NA |
| CELF2-AS1 | 414196 | CELF2 antisense RNA 1 | ENSG00000181800 | NA | NA |
| TNFRSF19 | 55504 | tumor necrosis factor receptor superfamily member 19 | ENSG00000127863 | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor is highly expressed during embryonic development. It has been shown to interact with TRAF family members, and to activate JNK signaling pathway when overexpressed in cells. This receptor is capable of inducing apoptosis by a caspase-independent mechanism, and it is thought to play an essential role in embryonic development. Alternatively spliced transcript variants encoding distinct isoforms have been described. | NA |
| MLF1 | 4291 | myeloid leukemia factor 1 | ENSG00000178053 | This gene encodes an oncoprotein which is thought to play a role in the phenotypic determination of hemopoietic cells. Translocations between this gene and nucleophosmin have been associated with myelodysplastic syndrome and acute myeloid leukemia. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| BLM | 641 | Bloom syndrome RecQ like helicase | ENSG00000197299 | The Bloom syndrome gene product is related to the RecQ subset of DExH box-containing DNA helicases and has both DNA-stimulated ATPase and ATP-dependent DNA helicase activities. Mutations causing Bloom syndrome delete or alter helicase motifs and may disable the 3’-5’ helicase activity. The normal protein may act to suppress inappropriate recombination. | NA |
| TRPV3 | 162514 | transient receptor potential cation channel subfamily V member 3 | ENSG00000167723 | This gene product belongs to a family of nonselective cation channels that function in a variety of processes, including temperature sensation and vasoregulation. The thermosensitive members of this family are expressed in subsets of sensory neurons that terminate in the skin, and are activated at distinct physiological temperatures. This channel is activated at temperatures between 22 and 40 degrees C. This gene lies in close proximity to another family member gene on chromosome 17, and the two encoded proteins are thought to associate with each other to form heteromeric channels. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| STRIP2 | 57464 | striatin interacting protein 2 | ENSG00000128578 | NA | NA |
| TSPAN18 | 90139 | tetraspanin 18 | ENSG00000157570 | NA | NA |
| ANK2 | 287 | ankyrin 2, neuronal | ENSG00000145362 | This gene encodes a member of the ankyrin family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton. Ankyrins play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. The protein encoded by this gene is required for targeting and stability of Na/Ca exchanger 1 in cardiomyocytes. Mutations in this gene cause long QT syndrome 4 and cardiac arrhythmia syndrome. Multiple transcript variants encoding different isoforms have been described. | NA |
| SGCA | 6442 | sarcoglycan alpha | ENSG00000108823 | This gene encodes a component of the dystrophin-glycoprotein complex (DGC), which is critical to the stability of muscle fiber membranes and to the linking of the actin cytoskeleton to the extracellular matrix. Its expression is thought to be restricted to striated muscle. Mutations in this gene result in type 2D autosomal recessive limb-girdle muscular dystrophy. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| PTGES3L | 100885848 | prostaglandin E synthase 3 (cytosolic)-like | ENSG00000267060 | NA | NA |
| SAMD4A | 23034 | sterile alpha motif domain containing 4A | ENSG00000020577 | Sterile alpha motifs (SAMs) in proteins such as SAMD4A are part of an RNA-binding domain that functions as a posttranscriptional regulator by binding to an RNA sequence motif known as the Smaug recognition element, which was named after the Drosophila Smaug protein (Baez and Boccaccio, 2005 [PubMed 16221671]). | NA |
| ADAMTS7 | 11173 | ADAM metallopeptidase with thrombospondin type 1 motif 7 | ENSG00000136378 | The protein encoded by this gene is a member of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) family. Members of this family share several distinct protein modules, including a propeptide region, a metalloproteinase domain, a disintegrin-like domain, and a thrombospondin type 1 (TS) motif. Individual members of this family differ in the number of C-terminal TS motifs, and some have unique C-terminal domains. The encoded preproprotein is proteolytically processed to generate the mature enzyme. This enzyme contains two C-terminal TS motifs and may regulate vascular smooth muscle cell (VSMC) migration. Mutations in this gene may be associated with susceptibility to coronary artery disease. | NA |
| NA | NA | NA | ENSG00000229164 | NA | TRUE |
| CLPS | 1208 | colipase | ENSG00000137392 | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| NBPF13P | ENSG00000227242 | neuroblastoma breakpoint family member 13, pseudogene | ENSG00000227242 | NA | NA |
| TMEM71 | 137835 | transmembrane protein 71 | ENSG00000165071 | NA | NA |
| COL23A1 | 91522 | collagen type XXIII alpha 1 chain | ENSG00000050767 | COL23A1 is a member of the transmembrane collagens, a subfamily of the nonfibrillar collagens that contain a single pass hydrophobic transmembrane domain (Banyard et al., 2003 [PubMed 12644459]). | NA |
| GAS7 | 8522 | growth arrest specific 7 | ENSG00000007237 | Growth arrest-specific 7 is expressed primarily in terminally differentiated brain cells and predominantly in mature cerebellar Purkinje neurons. GAS7 plays a putative role in neuronal development. Several transcript variants encoding proteins which vary in the N-terminus have been described. | NA |
| FLVCR2 | 55640 | feline leukemia virus subgroup C cellular receptor family member 2 | ENSG00000119686 | This gene encodes a member of the major facilitator superfamily. The encoded transmembrane protein is a calcium transporter. Unlike the related protein feline leukemia virus subgroup C receptor 1, the protein encoded by this locus does not bind to feline leukemia virus subgroup C envelope protein. The encoded protein may play a role in development of brain vascular endothelial cells, as mutations at this locus have been associated with proliferative vasculopathy and hydranencephaly-hydrocephaly syndrome. Alternatively spliced transcript variants have been described. | NA |
| FHL2 | 2274 | four and a half LIM domains 2 | ENSG00000115641 | This gene encodes a member of the four-and-a-half-LIM-only protein family. Family members contain two highly conserved, tandemly arranged, zinc finger domains with four highly conserved cysteines binding a zinc atom in each zinc finger. This protein is thought to have a role in the assembly of extracellular membranes. Also, this gene is down-regulated during transformation of normal myoblasts to rhabdomyosarcoma cells and the encoded protein may function as a link between presenilin-2 and an intracellular signaling pathway. Multiple alternatively spliced variants encoding different isoforms have been identified. | NA |
| PMP22 | 5376 | peripheral myelin protein 22 | ENSG00000109099 | This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | NA |
| DNAJA4 | 55466 | DnaJ heat shock protein family (Hsp40) member A4 | ENSG00000140403 | NA | NA |
| DPYSL4 | 10570 | dihydropyrimidinase like 4 | ENSG00000151640 | NA | NA |
| RP11-762H8.4 | ENSG00000272418 | NA | ENSG00000272418 | NA | NA |
| RNF157 | 114804 | ring finger protein 157 | ENSG00000141576 | NA | NA |
| RP11-54F2.1 | ENSG00000251196 | NA | ENSG00000251196 | NA | NA |
| CELF2 | 10659 | CUGBP, Elav-like family member 2 | ENSG00000048740 | Members of the CELF/BRUNOL protein family contain two N-terminal RNA recognition motif (RRM) domains, one C-terminal RRM domain, and a divergent segment of 160-230 aa between the second and third RRM domains. Members of this protein family regulate pre-mRNA alternative splicing and may also be involved in mRNA editing, and translation. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| PLN | 5350 | phospholamban | ENSG00000198523 | The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | NA |
| NHSL1 | 57224 | NHS like 1 | ENSG00000135540 | NA | NA |
| EGR2 | 1959 | early growth response 2 | ENSG00000122877 | The protein encoded by this gene is a transcription factor with three tandem C2H2-type zinc fingers. Defects in this gene are associated with Charcot-Marie-Tooth disease type 1D (CMT1D), Charcot-Marie-Tooth disease type 4E (CMT4E), and with Dejerine-Sottas syndrome (DSS). Multiple transcript variants encoding two different isoforms have been found for this gene. | NA |
| HSPB8 | 26353 | heat shock protein family B (small) member 8 | ENSG00000152137 | The protein encoded by this gene belongs to the superfamily of small heat-shock proteins containing a conservative alpha-crystallin domain at the C-terminal part of the molecule. The expression of this gene is induced by estrogen in estrogen receptor-positive breast cancer cells, and this protein also functions as a chaperone in association with Bag3, a stimulator of macroautophagy. Thus, this gene appears to be involved in regulation of cell proliferation, apoptosis, and carcinogenesis, and mutations in this gene have been associated with different neuromuscular diseases, including Charcot-Marie-Tooth disease. | NA |
| SHCBP1 | 79801 | SHC binding and spindle associated 1 | ENSG00000171241 | NA | NA |
| KBTBD8 | 84541 | kelch repeat and BTB domain containing 8 | ENSG00000163376 | NA | NA |
| MTND2P28 | ENSG00000225630 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | ENSG00000225630 | NA | NA |
| GPR183 | 1880 | G protein-coupled receptor 183 | ENSG00000169508 | This gene was identified by the up-regulation of its expression upon Epstein-Barr virus infection of primary B lymphocytes. This gene is predicted to encode a G protein-coupled receptor that is most closely related to the thrombin receptor. Expression of this gene was detected in B-lymphocyte cell lines and lymphoid tissues but not in T-lymphocyte cell lines or peripheral blood T lymphocytes. The function of this gene is unknown. | NA |
| AC019349.5 | ENSG00000229732 | NA | ENSG00000229732 | NA | NA |
| CELA2A | 63036 | chymotrypsin like elastase family member 2A | ENSG00000142615 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2A is secreted from the pancreas as a zymogen. In other species, elastase 2A has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| MFAP5 | 8076 | microfibrillar associated protein 5 | ENSG00000197614 | This gene encodes a 25-kD microfibril-associated glycoprotein which is a component of microfibrils of the extracellular matrix. The encoded protein promotes attachment of cells to microfibrils via alpha-V-beta-3 integrin. Deficiency of this gene in mice results in neutropenia. Alternate splicing results in multiple transcript variants encoding different isoforms. | NA |
| HYAL1 | 3373 | hyaluronoglucosaminidase 1 | ENSG00000114378 | This gene encodes a lysosomal hyaluronidase. Hyaluronidases intracellularly degrade hyaluronan, one of the major glycosaminoglycans of the extracellular matrix. Hyaluronan is thought to be involved in cell proliferation, migration and differentiation. This enzyme is active at an acidic pH and is the major hyaluronidase in plasma. Mutations in this gene are associated with mucopolysaccharidosis type IX, or hyaluronidase deficiency. The gene is one of several related genes in a region of chromosome 3p21.3 associated with tumor suppression. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| RCSD1 | 92241 | RCSD domain containing 1 | ENSG00000198771 | NA | NA |
| MTND1P23 | ENSG00000225972 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 1 pseudogene 23 | ENSG00000225972 | NA | NA |
| HK3 | 3101 | hexokinase 3 | ENSG00000160883 | Hexokinases phosphorylate glucose to produce glucose-6-phosphate, the first step in most glucose metabolism pathways. This gene encodes hexokinase 3. Similar to hexokinases 1 and 2, this allosteric enzyme is inhibited by its product glucose-6-phosphate. | NA |
| LIPF | 8513 | lipase F, gastric type | ENSG00000182333 | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| GPR137B | 7107 | G protein-coupled receptor 137B | ENSG00000077585 | NA | NA |
| OPLAH | 26873 | 5-oxoprolinase (ATP-hydrolysing) | ENSG00000178814 | The protein encoded by this gene acts as a homodimer, using ATP hydrolysis to catalyze the conversion of 5-oxo-L-proline to L-glutamate. Defects in this gene are a cause of 5-oxoprolinase deficiency (OPLAHD). | NA |
| PCOLCE2 | 26577 | procollagen C-endopeptidase enhancer 2 | ENSG00000163710 | NA | NA |
| ADAM23 | 8745 | ADAM metallopeptidase domain 23 | ENSG00000114948 | This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. It is reported that inactivation of this gene is associated with tumorigenesis in human cancers. | NA |
| KCNIP2 | 30819 | potassium voltage-gated channel interacting protein 2 | ENSG00000120049 | This gene encodes a member of the family of voltage-gated potassium (Kv) channel-interacting proteins (KCNIPs), which belongs to the recoverin branch of the EF-hand superfamily. Members of the KCNIP family are small calcium binding proteins. They all have EF-hand-like domains, and differ from each other in the N-terminus. They are integral subunit components of native Kv4 channel complexes. They may regulate A-type currents, and hence neuronal excitability, in response to changes in intracellular calcium. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified from this gene. | NA |
| SCOC-AS1 | 100129858 | SCOC antisense RNA 1 | ENSG00000196951 | NA | NA |
| TNXB | 7148 | tenascin XB | ENSG00000168477 | This gene encodes a member of the tenascin family of extracellular matrix glycoproteins. The tenascins have anti-adhesive effects, as opposed to fibronectin which is adhesive. This protein is thought to function in matrix maturation during wound healing, and its deficiency has been associated with the connective tissue disorder Ehlers-Danlos syndrome. This gene localizes to the major histocompatibility complex (MHC) class III region on chromosome 6. It is one of four genes in this cluster which have been duplicated. The duplicated copy of this gene is incomplete and is a pseudogene which is transcribed but does not encode a protein. The structure of this gene is unusual in that it overlaps the CREBL1 and CYP21A2 genes at its 5’ and 3’ ends, respectively. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| KRT2 | 3849 | keratin 2 | ENSG00000172867 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| THBS4 | 7060 | thrombospondin 4 | ENSG00000113296 | The protein encoded by this gene belongs to the thrombospondin protein family. Thrombospondin family members are adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions. This protein forms a pentamer and can bind to heparin and calcium. It is involved in local signaling in the developing and adult nervous system, and it contributes to spinal sensitization and neuropathic pain states. This gene is activated during the stromal response to invasive breast cancer. It may also play a role in inflammatory responses in Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | NA |
| JAM2 | 58494 | junctional adhesion molecule 2 | ENSG00000154721 | This gene belongs to the immunoglobulin superfamily, and the junctional adhesion molecule (JAM) family. The protein encoded by this gene is a type I membrane protein that is localized at the tight junctions of both epithelial and endothelial cells. It acts as an adhesive ligand for interacting with a variety of immune cell types, and may play a role in lymphocyte homing to secondary lymphoid organs. Alternatively spliced transcript variants have been found for this gene. | NA |
| APOBEC2 | 10930 | apolipoprotein B mRNA editing enzyme catalytic subunit 2 | ENSG00000124701 | NA | NA |
| CRNN | 49860 | cornulin | ENSG00000143536 | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | NA |
| PLPP7 | 84814 | phospholipid phosphatase 7 (inactive) | ENSG00000160539 | NA | NA |
| LOC101928718 | 101928718 | uncharacterized LOC101928718 | ENSG00000197852 | NA | NA |
| FAM212B | 55924 | family with sequence similarity 212 member B | ENSG00000197852 | NA | NA |
| COLGALT2 | 23127 | collagen beta(1-O)galactosyltransferase 2 | ENSG00000198756 | NA | NA |
| ANKRD9 | 122416 | ankyrin repeat domain 9 | ENSG00000156381 | NA | NA |
| NA | NA | NA | ENSG00000204794 | NA | TRUE |
| PKIA | 5569 | protein kinase (cAMP-dependent, catalytic) inhibitor alpha | ENSG00000171033 | The protein encoded by this gene is a member of the cAMP-dependent protein kinase (PKA) inhibitor family. This protein was demonstrated to interact with and inhibit the activities of both C alpha and C beta catalytic subunits of the PKA. Alternatively spliced transcript variants encoding the same protein have been reported. | NA |
| CPNE5 | 57699 | copine 5 | ENSG00000124772 | Calcium-dependent membrane-binding proteins may regulate molecular events at the interface of the cell membrane and cytoplasm. This gene is one of several genes that encode a calcium-dependent protein containing two N-terminal type II C2 domains and an integrin A domain-like sequence in the C-terminus. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. More variants may exist, but their full-length natures could not be determined. | NA |
| TMEM176B | 28959 | transmembrane protein 176B | ENSG00000106565 | NA | NA |
| SPX | 80763 | spexin hormone | ENSG00000134548 | The protein encoded by this gene is a hormone involved in modulation of cardiovascular and renal function. It has also been shown in rats to cause weight loss. Several transcript variants have been found for this gene. | NA |
| NKD2 | 85409 | naked cuticle homolog 2 | ENSG00000145506 | This gene encodes a member of a family of proteins that function as negative regulators of Wnt receptor signaling through interaction with Dishevelled family members. The encoded protein participates in the delivery of transforming growth factor alpha-containing vesicles to the cell membrane. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| CMYA5 | 202333 | cardiomyopathy associated 5 | ENSG00000164309 | NA | NA |
| P2RX7 | 5027 | purinergic receptor P2X 7 | ENSG00000089041 | The product of this gene belongs to the family of purinoceptors for ATP. This receptor functions as a ligand-gated ion channel and is responsible for ATP-dependent lysis of macrophages through the formation of membrane pores permeable to large molecules. Activation of this nuclear receptor by ATP in the cytoplasm may be a mechanism by which cellular activity can be coupled to changes in gene expression. Multiple alternatively spliced variants have been identified, most of which fit nonsense-mediated decay (NMD) criteria. | NA |
| KIF1A | 547 | kinesin family member 1A | ENSG00000130294 | The protein encoded by this gene is a member of the kinesin family and functions as an anterograde motor protein that transports membranous organelles along axonal microtubules. Mutations at this locus have been associated with spastic paraplegia-30 and hereditary sensory neuropathy IIC. Alternatively spliced transcript variants encoding distinct isoforms have been described. | NA |
| GATB | 5188 | glutamyl-tRNA(Gln) amidotransferase, subunit B | ENSG00000059691 | NA | NA |
| PYGM | 5837 | phosphorylase, glycogen, muscle | ENSG00000068976 | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | NA |
| RP1-253P7.4 | ENSG00000197815 | NA | ENSG00000197815 | NA | NA |
| TNFRSF12A | 51330 | tumor necrosis factor receptor superfamily member 12A | ENSG00000006327 | NA | NA |
# Save the Ensembl IDs queried for cluster 7, one ID per line.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 7, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
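
The cluster 7 table above lists `ENSG00000197852` twice (it maps to both `LOC101928718` and `FAM212B`) and contains a row with `notfound = TRUE`, so `out$query` can hold repeated or unresolved IDs. If those should be excluded from the saved gene list, a minimal sketch could look like the following. The helper `clean_query_ids` and the toy data frame are hypothetical and not part of the original analysis; the toy values are copied from the table above, and the file is written to a temporary directory rather than the project path.

```r
# Hypothetical helper (not part of the original analysis): collect the
# Ensembl IDs worth saving from a queryMany()-style result.
clean_query_ids <- function(out) {
  df <- as.data.frame(out)
  # Unresolved IDs carry notfound = TRUE; the column may be absent when
  # every ID was found, so guard for that case.
  if ("notfound" %in% names(df)) {
    found <- is.na(df$notfound) | !df$notfound
  } else {
    found <- rep(TRUE, nrow(df))
  }
  # Keep each resolved ID once, in order of first appearance.
  unique(as.character(df$query[found]))
}

# Toy stand-in for `out`, with values copied from the cluster 7 table.
toy_out <- data.frame(
  query = c("ENSG00000197852", "ENSG00000197852",
            "ENSG00000204794", "ENSG00000171033"),
  symbol = c("LOC101928718", "FAM212B", NA, "PKIA"),
  notfound = c(NA, NA, TRUE, NA),
  stringsAsFactors = FALSE
)

ids <- clean_query_ids(toy_out)

# Write to a temporary location (assumption: the real path would be the
# ../utilities/... file used in the analysis).
out_file <- file.path(tempdir(), "gene_names_clus_7.txt")
write.table(ids, out_file, col.names = FALSE, row.names = FALSE, quote = FALSE)
```

Whether unresolved IDs should be dropped depends on the downstream tool; if they need to be kept, only the `unique()` step would apply.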
out <- mygene::queryMany(gene_list[8,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | summary | query | name | notfound |
|---|---|---|---|---|---|
| C10orf10 | 11067 | The expression of this gene is induced by fasting as well as by progesterone. The protein encoded by this gene contains a t-synaptosome-associated protein receptor (SNARE) coiled-coil homology domain and a peroxisomal targeting signal. Production of the encoded protein leads to phosphorylation and activation of the transcription factor ELK1. | ENSG00000165507 | chromosome 10 open reading frame 10 | NA |
| SNHG25 | ENSG00000266402 | NA | ENSG00000266402 | small nucleolar RNA host gene 25 | NA |
| CXCL2 | 2920 | This antimicrobial gene is part of a chemokine superfamily that encodes secreted proteins involved in immunoregulatory and inflammatory processes. The superfamily is divided into four subfamilies based on the arrangement of the N-terminal cysteine residues of the mature peptide. This chemokine, a member of the CXC subfamily, is expressed at sites of inflammation and may suppress hematopoietic progenitor cell proliferation. | ENSG00000081041 | C-X-C motif chemokine ligand 2 | NA |
| NUPR1 | 26471 | NA | ENSG00000176046 | nuclear protein 1, transcriptional regulator | NA |
| HSPD1P1 | ENSG00000213430 | NA | ENSG00000213430 | heat shock protein family D (Hsp60) member 1 pseudogene 1 | NA |
| ERRFI1 | 54206 | ERRFI1 is a cytoplasmic protein whose expression is upregulated with cell growth (Wick et al., 1995 [PubMed 7641805]). It shares significant homology with the protein product of rat gene-33, which is induced during cell stress and mediates cell signaling (Makkinje et al., 2000 [PubMed 10749885]; Fiorentino et al., 2000 [PubMed 11003669]). | ENSG00000116285 | ERBB receptor feedback inhibitor 1 | NA |
| MT1M | 4499 | This gene encodes a member of the metallothionein superfamily, type 1 family. Metallothioneins have a high content of cysteine residues that bind various heavy metals. These genes are transcriptionally regulated by both heavy metals and glucocorticoids. | ENSG00000205364 | metallothionein 1M | NA |
| N4BP2L1 | 90634 | NA | ENSG00000139597 | NEDD4 binding protein 2-like 1 | NA |
| RP11-442H21.2 | ENSG00000269926 | NA | ENSG00000269926 | NA | NA |
| ANGPTL4 | 51129 | This gene encodes a glycosylated, secreted protein containing a C-terminal fibrinogen domain. The encoded protein is induced by peroxisome proliferation activators and functions as a serum hormone that regulates glucose homeostasis, lipid metabolism, and insulin sensitivity. This protein can also act as an apoptosis survival factor for vascular endothelial cells and can prevent metastasis by inhibiting vascular growth and tumor cell invasion. The C-terminal domain may be proteolytically-cleaved from the full-length secreted protein. Decreased expression of this gene has been associated with type 2 diabetes. Alternative splicing results in multiple transcript variants. This gene was previously referred to as ANGPTL2 but has been renamed ANGPTL4. | ENSG00000167772 | angiopoietin like 4 | NA |
| CTH | 1491 | This gene encodes a cytoplasmic enzyme in the trans-sulfuration pathway that converts cystathionine derived from methionine into cysteine. Glutathione synthesis in the liver is dependent upon the availability of cysteine. Mutations in this gene cause cystathioninuria. Alternative splicing of this gene results in three transcript variants encoding different isoforms. | ENSG00000116761 | cystathionine gamma-lyase | NA |
| INSIG1 | 3638 | Oxysterols regulate cholesterol homeostasis through the liver X receptor (LXR)- and sterol regulatory element-binding protein (SREBP)-mediated signaling pathways. This gene is an insulin-induced gene. It encodes an endoplasmic reticulum (ER) membrane protein that plays a critical role in regulating cholesterol concentrations in cells. This protein binds to the sterol-sensing domains of SREBP cleavage-activating protein (SCAP) and HMG CoA reductase, and is essential for the sterol-mediated trafficking of the two proteins. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | ENSG00000186480 | insulin induced gene 1 | NA |
| DDIT4 | 54541 | NA | ENSG00000168209 | DNA damage inducible transcript 4 | NA |
| BHLHE40-AS1 | 100507582 | NA | ENSG00000235831 | BHLHE40 antisense RNA 1 | NA |
| L3MBTL4 | 91133 | NA | ENSG00000154655 | l(3)mbt-like 4 (Drosophila) | NA |
| MPP6 | 51678 | Members of the peripheral membrane-associated guanylate kinase (MAGUK) family function in tumor suppression and receptor clustering by forming multiprotein complexes containing distinct sets of transmembrane, cytoskeletal, and cytoplasmic signaling proteins. All MAGUKs contain a PDZ-SH3-GUK core and are divided into 4 subfamilies, DLG-like (see DLG1; MIM 601014), ZO1-like (see TJP1; MIM 601009), p55-like (see MPP1; MIM 305360), and LIN2-like (see CASK; MIM 300172), based on their size and the presence of additional domains. MPP6 is a member of the p55-like MAGUK subfamily (Tseng et al., 2001 [PubMed 11311936]). | ENSG00000105926 | membrane palmitoylated protein 6 | NA |
| PHGDH | 26227 | This gene encodes the enzyme which is involved in the early steps of L-serine synthesis in animal cells. L-serine is required for D-serine and other amino acid synthesis. The enzyme requires NAD/NADH as a cofactor and forms homotetramers for activity. Mutations in this gene have been found in a family with congenital microcephaly, psychomotor retardation and other symptoms. Multiple alternatively spliced transcript variants have been found, however the full-length nature of most are not known. | ENSG00000092621 | phosphoglycerate dehydrogenase | NA |
| SNORA32 | 692063 | NA | ENSG00000206799 | small nucleolar RNA, H/ACA box 32 | NA |
| RGS1 | 5996 | This gene encodes a member of the regulator of G-protein signalling family. This protein is located on the cytosolic side of the plasma membrane and contains a conserved, 120 amino acid motif called the RGS domain. The protein attenuates the signalling activity of G-proteins by binding to activated, GTP-bound G alpha subunits and acting as a GTPase activating protein (GAP), increasing the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. | ENSG00000090104 | regulator of G-protein signaling 1 | NA |
| AC010761.9 | ENSG00000265474 | NA | ENSG00000265474 | NA | NA |
| SPOCK1 | 6695 | This gene encodes the protein core of a seminal plasma proteoglycan containing chondroitin- and heparan-sulfate chains. The protein’s function is unknown, although similarity to thyropin-type cysteine protease-inhibitors suggests its function may be related to protease inhibition. | ENSG00000152377 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 1 | NA |
| PAWR | 5074 | The tumor suppressor WT1 represses and activates transcription. The protein encoded by this gene is a WT1-interacting protein that itself functions as a transcriptional repressor. It contains a putative leucine zipper domain which interacts with the zinc finger DNA binding domain of WT1. This protein is specifically upregulated during apoptosis of prostate cells. | ENSG00000177425 | pro-apoptotic WT1 regulator | NA |
| RPL35P1 | ENSG00000237991 | NA | ENSG00000237991 | ribosomal protein L35 pseudogene 1 | NA |
| CX3CL1 | 6376 | NA | ENSG00000006210 | C-X3-C motif chemokine ligand 1 | NA |
| HSPD1 | 3329 | This gene encodes a member of the chaperonin family. The encoded mitochondrial protein may function as a signaling molecule in the innate immune system. This protein is essential for the folding and assembly of newly imported proteins in the mitochondria. This gene is adjacent to a related family member and the region between the 2 genes functions as a bidirectional promoter. Several pseudogenes have been associated with this gene. Two transcript variants encoding the same protein have been identified for this gene. Mutations associated with this gene cause autosomal recessive spastic paraplegia 13. | ENSG00000144381 | heat shock protein family D (Hsp60) member 1 | NA |
| AC010761.10 | ENSG00000265840 | NA | ENSG00000265840 | NA | NA |
| STX17-AS1 | 441461 | NA | ENSG00000255145 | STX17 antisense RNA 1 | NA |
| RP5-1112D6.7 | ENSG00000271789 | NA | ENSG00000271789 | NA | NA |
| NSUN6 | 221078 | NA | ENSG00000241058 | NOP2/Sun RNA methyltransferase family member 6 | NA |
| TSPAN12 | 23554 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | ENSG00000106025 | tetraspanin 12 | NA |
| AC079922.2 | ENSG00000231747 | NA | ENSG00000231747 | NA | NA |
| RPL9P32 | ENSG00000242100 | NA | ENSG00000242100 | ribosomal protein L9 pseudogene 32 | NA |
| UGDH | 7358 | The protein encoded by this gene converts UDP-glucose to UDP-glucuronate and thereby participates in the biosynthesis of glycosaminoglycans such as hyaluronan, chondroitin sulfate, and heparan sulfate. These glycosylated compounds are common components of the extracellular matrix and likely play roles in signal transduction, cell migration, and cancer growth and metastasis. The expression of this gene is up-regulated by transforming growth factor beta and down-regulated by hypoxia. Alternative splicing results in multiple transcript variants. | ENSG00000109814 | UDP-glucose 6-dehydrogenase | NA |
| RP11-42O15.3 | ENSG00000271992 | NA | ENSG00000271992 | NA | NA |
| TRAF4 | 9618 | This gene encodes a member of the TNF receptor associated factor (TRAF) family. TRAF proteins are associated with, and mediate signal transduction from members of the TNF receptor superfamily. The encoded protein has been shown to interact with neurotrophin receptor, p75 (NTR/NTSR1), and negatively regulate NTR induced cell death and NF-kappa B activation. This protein has been found to bind to p47phox, a cytosolic regulatory factor included in a multi-protein complex known as NAD(P)H oxidase. This protein thus, is thought to be involved in the oxidative activation of MAPK8/JNK. Alternatively spliced transcript variants have been observed but the full-length nature of only one has been determined. | ENSG00000076604 | TNF receptor associated factor 4 | NA |
| HSPE1 | 3336 | This gene encodes a major heat shock protein which functions as a chaperonin. Its structure consists of a heptameric ring which binds to another heat shock protein in order to form a symmetric, functional heterodimer which enhances protein folding in an ATP-dependent manner. This gene and its co-chaperonin, HSPD1, are arranged in a head-to-head orientation on chromosome 2. Naturally occurring read-through transcription occurs between this locus and the neighboring locus MOBKL3. | ENSG00000115541 | heat shock protein family E (Hsp10) member 1 | NA |
| NADK2 | 133686 | This gene encodes a mitochondrial kinase that catalyzes the phosphorylation of NAD to yield NADP. Mutations in this gene result in 2,4-dienoyl-CoA reductase deficiency. Alternative splicing results in multiple transcript variants. | ENSG00000152620 | NAD kinase 2, mitochondrial | NA |
| NA | NA | NA | ENSG00000261280 | NA | TRUE |
| PIGHP1 | ENSG00000259657 | NA | ENSG00000259657 | phosphatidylinositol glycan anchor biosynthesis class H pseudogene 1 | NA |
| LINC01473 | 101927217 | NA | ENSG00000237877 | long intergenic non-protein coding RNA 1473 | NA |
| IL6 | 3569 | This gene encodes a cytokine that functions in inflammation and the maturation of B cells. In addition, the encoded protein has been shown to be an endogenous pyrogen capable of inducing fever in people with autoimmune diseases or infections. The protein is primarily produced at sites of acute and chronic inflammation, where it is secreted into the serum and induces a transcriptional inflammatory response through interleukin 6 receptor, alpha. The functioning of this gene is implicated in a wide variety of inflammation-associated disease states, including susceptibility to diabetes mellitus and systemic juvenile rheumatoid arthritis. Alternative splicing results in multiple transcript variants. | ENSG00000136244 | interleukin 6 | NA |
| RP11-54O7.14 | ENSG00000242590 | NA | ENSG00000242590 | NA | NA |
| DFNB59 | 494513 | The protein encoded by this gene is a member of the gasdermin family, a family which is found only in vertebrates. The encoded protein is required for the proper function of auditory pathway neurons. Defects in this gene are a cause of non-syndromic sensorineural deafness autosomal recessive type 59 (DFNB59). | ENSG00000204311 | deafness, autosomal recessive 59 | NA |
| AC009404.2 | ENSG00000236255 | NA | ENSG00000236255 | NA | NA |
| AOX1 | 316 | Aldehyde oxidase produces hydrogen peroxide and, under certain conditions, can catalyze the formation of superoxide. Aldehyde oxidase is a candidate gene for amyotrophic lateral sclerosis. | ENSG00000138356 | aldehyde oxidase 1 | NA |
| RP11-83J16.1 | ENSG00000231409 | NA | ENSG00000231409 | NA | NA |
| RPL35P5 | ENSG00000225573 | NA | ENSG00000225573 | ribosomal protein L35 pseudogene 5 | NA |
| HESX1 | 8820 | This gene encodes a conserved homeobox protein that is a transcriptional repressor in the developing forebrain and pituitary gland. Mutations in this gene are associated with septooptic dysplasia, HESX1-related growth hormone deficiency, and combined pituitary hormone deficiency. | ENSG00000163666 | HESX homeobox 1 | NA |
| RARRES1 | 5918 | This gene was identified as a retinoic acid (RA) receptor-responsive gene. It encodes a type 1 membrane protein. The expression of this gene is upregulated by tazarotene as well as by retinoic acid receptors. The expression of this gene is found to be downregulated in prostate cancer, which is caused by the methylation of its promoter and CpG island. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | ENSG00000118849 | retinoic acid receptor responder 1 | NA |
| C8orf4 | 56892 | This gene encodes a small, monomeric, predominantly unstructured protein that functions as a positive regulator of the Wnt/beta-catenin signaling pathway. This protein interacts with a repressor of beta-catenin mediated transcription at nuclear speckles. It is thought to competitively block interactions of the repressor with beta-catenin, resulting in up-regulation of beta-catenin target genes. The encoded protein may also play a role in the NF-kappaB and ERK1/2 signaling pathways. Expression of this gene may play a role in the proliferation of several types of cancer including thyroid cancer, breast cancer and hematological malignancies. | ENSG00000176907 | chromosome 8 open reading frame 4 | NA |
| NOTCH4 | 4855 | This gene encodes a member of the NOTCH family of proteins. Members of this Type I transmembrane protein family share structural characteristics including an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. Notch signaling is an evolutionarily conserved intercellular signaling pathway that regulates interactions between physically adjacent cells through binding of Notch family receptors to their cognate ligands. The encoded preproprotein is proteolytically processed in the trans-Golgi network to generate two polypeptide chains that heterodimerize to form the mature cell-surface receptor. This receptor may play a role in vascular, renal and hepatic development. Mutations in this gene may be associated with schizophrenia. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000204301 | notch 4 | NA |
| EGR2 | 1959 | The protein encoded by this gene is a transcription factor with three tandem C2H2-type zinc fingers. Defects in this gene are associated with Charcot-Marie-Tooth disease type 1D (CMT1D), Charcot-Marie-Tooth disease type 4E (CMT4E), and with Dejerine-Sottas syndrome (DSS). Multiple transcript variants encoding two different isoforms have been found for this gene. | ENSG00000122877 | early growth response 2 | NA |
| NA | NA | NA | ENSG00000269165 | NA | TRUE |
| CKS1B | 1163 | CKS1B protein binds to the catalytic subunit of the cyclin dependent kinases and is essential for their biological function. The CKS1B mRNA is found to be expressed in different patterns through the cell cycle in HeLa cells, which reflects a specialized role for the encoded protein. At least two transcript variants have been identified for this gene, and it appears that only one of them encodes a protein. | ENSG00000173207 | CDC28 protein kinase regulatory subunit 1B | NA |
| DHODH | 1723 | The protein encoded by this gene catalyzes the fourth enzymatic step, the ubiquinone-mediated oxidation of dihydroorotate to orotate, in de novo pyrimidine biosynthesis. This protein is a mitochondrial protein located on the outer surface of the inner mitochondrial membrane. | ENSG00000102967 | dihydroorotate dehydrogenase (quinone) | NA |
| EIF4EBP1 | 1978 | This gene encodes one member of a family of translation repressor proteins. The protein directly interacts with eukaryotic translation initiation factor 4E (eIF4E), which is a limiting component of the multisubunit complex that recruits 40S ribosomal subunits to the 5’ end of mRNAs. Interaction of this protein with eIF4E inhibits complex assembly and represses translation. This protein is phosphorylated in response to various signals including UV irradiation and insulin signaling, resulting in its dissociation from eIF4E and activation of mRNA translation. | ENSG00000187840 | eukaryotic translation initiation factor 4E binding protein 1 | NA |
| GCH1 | 2643 | This gene encodes a member of the GTP cyclohydrolase family. The encoded protein is the first and rate-limiting enzyme in tetrahydrobiopterin (BH4) biosynthesis, catalyzing the conversion of GTP into 7,8-dihydroneopterin triphosphate. BH4 is an essential cofactor required by aromatic amino acid hydroxylases as well as nitric oxide synthases. Mutations in this gene are associated with malignant hyperphenylalaninemia and dopa-responsive dystonia. Several alternatively spliced transcript variants encoding different isoforms have been described; however, not all variants give rise to a functional enzyme. | ENSG00000131979 | GTP cyclohydrolase 1 | NA |
| SMO | 6608 | The protein encoded by this gene is a G protein-coupled receptor that interacts with the patched protein, a receptor for hedgehog proteins. The encoded protein transduces signals to other proteins after activation by a hedgehog protein/patched protein complex. | ENSG00000128602 | smoothened, frizzled class receptor | NA |
| MMAB | 326625 | This gene encodes a protein that catalyzes the final step in the conversion of vitamin B(12) into adenosylcobalamin (AdoCbl), a vitamin B12-containing coenzyme for methylmalonyl-CoA mutase. Mutations in the gene are the cause of vitamin B12-dependent methylmalonic aciduria linked to the cblB complementation group. Alternatively spliced transcript variants have been found. | ENSG00000139428 | methylmalonic aciduria (cobalamin deficiency) cblB type | NA |
| GJA4 | 2701 | This gene encodes a member of the connexin gene family. The encoded protein is a component of gap junctions, which are composed of arrays of intercellular channels that provide a route for the diffusion of low molecular weight materials from cell to cell. Mutations in this gene have been associated with atherosclerosis and a higher risk of myocardial infarction. | ENSG00000187513 | gap junction protein alpha 4 | NA |
| RPS7P3 | ENSG00000231940 | NA | ENSG00000231940 | ribosomal protein S7 pseudogene 3 | NA |
| RP11-592N21.1 | ENSG00000212664 | NA | ENSG00000212664 | NA | NA |
| PPP1R14B | 26472 | NA | ENSG00000173457 | protein phosphatase 1 regulatory inhibitor subunit 14B | NA |
| HSD17B7 | 51478 | HSD17B7 encodes an enzyme that functions both as a 17-beta-hydroxysteroid dehydrogenase (EC 1.1.1.62) in the biosynthesis of sex steroids and as a 3-ketosteroid reductase (EC 1.1.1.270) in the biosynthesis of cholesterol (Marijanovic et al., 2003 [PubMed 12829805]). | ENSG00000132196 | hydroxysteroid 17-beta dehydrogenase 7 | NA |
| AFMID | 125061 | NA | ENSG00000183077 | arylformamidase | NA |
| SLC43A1 | 8501 | SLC43A1 belongs to the system L family of plasma membrane carrier proteins that transports large neutral amino acids (Babu et al., 2003 [PubMed 12930836]). | ENSG00000149150 | solute carrier family 43 member 1 | NA |
| ZFP36L1 | 677 | This gene is a member of the TIS11 family of early response genes, which are induced by various agonists such as the phorbol ester TPA and the polypeptide mitogen EGF. This gene is well conserved across species and has a promoter that contains motifs seen in other early-response genes. The encoded protein contains a distinguishing putative zinc finger domain with a repeating cys-his motif. This putative nuclear transcription factor most likely functions in regulating the response to growth factors. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000185650 | ZFP36 ring finger protein-like 1 | NA |
| IFITM4P | ENSG00000235821 | NA | ENSG00000235821 | interferon induced transmembrane protein 4 pseudogene | NA |
| CTC-301O7.4 | ENSG00000197813 | NA | ENSG00000197813 | NA | NA |
| CRYZ | 1429 | Crystallins are separated into two classes: taxon-specific, or enzyme, and ubiquitous. The latter class constitutes the major proteins of vertebrate eye lens and maintains the transparency and refractive index of the lens. The former class is also called phylogenetically-restricted crystallins. This gene encodes a taxon-specific crystallin protein which has NADPH-dependent quinone reductase activity distinct from other known quinone reductases. It lacks alcohol dehydrogenase activity although by similarity it is considered a member of the zinc-containing alcohol dehydrogenase family. Unlike other mammalian species, in humans, lens expression is low. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. One pseudogene is known to exist. | ENSG00000116791 | crystallin zeta | NA |
| RPL17P50 | ENSG00000213700 | NA | ENSG00000213700 | ribosomal protein L17 pseudogene 50 | NA |
| RPS2P48 | ENSG00000233380 | NA | ENSG00000233380 | ribosomal protein S2 pseudogene 48 | NA |
| NA | NA | NA | ENSG00000273097 | NA | TRUE |
| TOP1MT | 116447 | This gene encodes a mitochondrial DNA topoisomerase that plays a role in the modification of DNA topology. The encoded protein is a type IB topoisomerase and catalyzes the transient breaking and rejoining of DNA to relieve tension and DNA supercoiling generated in the mitochondrial genome during replication and transcription. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000184428 | topoisomerase (DNA) I, mitochondrial | NA |
| NA | NA | NA | ENSG00000269999 | NA | TRUE |
| ELL2P1 | ENSG00000227295 | NA | ENSG00000227295 | elongation factor, RNA polymerase II, 2 pseudogene 1 | NA |
| ASS1P2 | ENSG00000223922 | NA | ENSG00000223922 | argininosuccinate synthetase 1 pseudogene 2 | NA |
| OLMALINC | ENSG00000235823 | NA | ENSG00000235823 | oligodendrocyte maturation-associated long intergenic non-coding RNA | NA |
| RP11-727A23.4 | ENSG00000254676 | NA | ENSG00000254676 | NA | NA |
| PA2G4P4 | ENSG00000230457 | NA | ENSG00000230457 | proliferation-associated 2G4 pseudogene 4 | NA |
| RCL1 | 10171 | NA | ENSG00000120158 | RNA terminal phosphate cyclase like 1 | NA |
| HSPA4L | 22824 | The protein encoded by this gene is heat shock inducible and may act as a chaperone. The encoded protein can protect the heat-shocked cell against the harmful effects of aggregated proteins. This gene is highly expressed in leukemia cells and may be a good target for therapeutic intervention. Several transcripts encoding different isoforms have been found for this gene. | ENSG00000164070 | heat shock protein family A (Hsp70) member 4 like | NA |
| GRHL1 | 29841 | This gene encodes a member of the grainyhead family of transcription factors. The encoded protein can exist as a homodimer or can form heterodimers with sister-of-mammalian grainyhead or brother-of-mammalian grainyhead. This protein functions as a transcription factor during development. | ENSG00000134317 | grainyhead like transcription factor 1 | NA |
| RP11-799B12.2 | ENSG00000264924 | NA | ENSG00000264924 | NA | NA |
| CKS1BP3 | ENSG00000268942 | NA | ENSG00000268942 | CDC28 protein kinase regulatory subunit 1B pseudogene 3 | NA |
| HSPE1P2 | ENSG00000258645 | NA | ENSG00000258645 | heat shock protein family E (Hsp10) member 1 pseudogene 2 | NA |
| SIX5 | 147912 | The protein encoded by this gene is a homeodomain-containing transcription factor that appears to function in the regulation of organogenesis. This gene is located downstream of the dystrophia myotonica-protein kinase gene. Mutations in this gene are a cause of branchiootorenal syndrome type 2. | ENSG00000177045 | SIX homeobox 5 | NA |
| GAS2 | 2620 | The protein encoded by this gene is a caspase-3 substrate that plays a role in regulating microfilament and cell shape changes during apoptosis. It can also modulate cell susceptibility to p53-dependent apoptosis by inhibiting calpain activity. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000148935 | growth arrest specific 2 | NA |
| RP11-513G19.1 | ENSG00000255968 | NA | ENSG00000255968 | NA | NA |
| CTD-2031P19.4 | ENSG00000264281 | NA | ENSG00000264281 | NA | NA |
| RP11-70L8.4 | ENSG00000265194 | NA | ENSG00000265194 | NA | NA |
| MGMT | 4255 | Alkylating agents are potent carcinogens that can result in cell death, mutation and cancer. The protein encoded by this gene is a DNA repair protein that is involved in cellular defense against mutagenesis and toxicity from alkylating agents. The protein catalyzes transfer of methyl groups from O(6)-alkylguanine and other methylated moieties of the DNA to its own molecule, which repairs the toxic lesions. Methylation of the gene’s promoter has been associated with several cancer types, including colorectal cancer, lung cancer, lymphoma and glioblastoma. | ENSG00000170430 | O-6-methylguanine-DNA methyltransferase | NA |
| GLS2 | 27165 | The protein encoded by this gene is a mitochondrial phosphate-activated glutaminase that catalyzes the hydrolysis of glutamine to stoichiometric amounts of glutamate and ammonia. Originally thought to be liver-specific, this protein has been found in other tissues as well. Alternative splicing results in multiple transcript variants that encode different isoforms. | ENSG00000135423 | glutaminase 2 | NA |
| TTC39C | 125488 | NA | ENSG00000168234 | tetratricopeptide repeat domain 39C | NA |
| PDE8A | 5151 | The protein encoded by this gene belongs to the cyclic nucleotide phosphodiesterase (PDE) family, and PDE8 subfamily. This PDE hydrolyzes the second messenger, cAMP, which is a regulator and mediator of a number of cellular responses to extracellular signals. Thus, by regulating the cellular concentration of cAMP, this protein plays a key role in many important physiological processes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000073417 | phosphodiesterase 8A | NA |
| NFIB | 4781 | NA | ENSG00000147862 | nuclear factor I B | NA |
| MICAL2 | 9645 | NA | ENSG00000133816 | microtubule associated monooxygenase, calponin and LIM domain containing 2 | NA |
| STK3 | 6788 | This gene encodes a serine/threonine protein kinase activated by proapoptotic molecules indicating the encoded protein functions as a growth suppressor. Cleavage of the protein product by caspase removes the inhibitory C-terminal portion. The N-terminal portion is transported to the nucleus where it homodimerizes to form the active kinase which promotes the condensation of chromatin during apoptosis. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000104375 | serine/threonine kinase 3 | NA |
| GAMT | 2593 | The protein encoded by this gene is a methyltransferase that converts guanidoacetate to creatine, using S-adenosylmethionine as the methyl donor. Defects in this gene have been implicated in neurologic syndromes and muscular hypotonia, probably due to creatine deficiency and accumulation of guanidinoacetate in the brain of affected individuals. Two transcript variants encoding different isoforms have been described for this gene. Pseudogenes of this gene are found on chromosomes 2 and 13. | ENSG00000130005 | guanidinoacetate N-methyltransferase | NA |
| PTP4A1 | 7803 | This gene encodes a member of a small class of prenylated protein tyrosine phosphatases (PTPs), which contain a PTP domain and a characteristic C-terminal prenylation motif. The encoded protein is a cell signaling molecule that plays regulatory roles in a variety of cellular processes, including cell proliferation and migration. The protein may also be involved in cancer development and metastasis. This tyrosine phosphatase is a nuclear protein, but may associate with plasma membrane by means of its prenylation motif. Pseudogenes related to this gene are located on chromosomes 1, 2, 5, 7, 11 and X. | ENSG00000112245 | protein tyrosine phosphatase type IVA, member 1 | NA |
# Save the Ensembl IDs queried for cluster 8, one per line, without header or quotes.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 8, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
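The write step above is repeated once per cluster with only the cluster number changing. A minimal sketch of how it could be looped, assuming a hypothetical helper `write_cluster_ids` and a toy `toy_list` standing in for `gene_list` (neither is part of the original analysis). It writes the rows of the matrix directly rather than `out$query`, which may differ if the query result repeats or drops IDs.

```r
# Hypothetical helper: write one file of gene IDs per cluster (one row of gene_list each).
write_cluster_ids <- function(gene_list, out_dir, prefix = "gene_names_clus_") {
  paths <- character(nrow(gene_list))
  for (k in seq_len(nrow(gene_list))) {
    paths[k] <- file.path(out_dir, paste0(prefix, k, ".txt"))
    # One ID per line, no header, no quotes
    writeLines(gene_list[k, ], paths[k])
  }
  paths
}

# Toy stand-in for gene_list, using a few IDs that appear in the tables of this section.
toy_list <- rbind(
  c("ENSG00000134317", "ENSG00000177045"),
  c("ENSG00000124253", "ENSG00000162267")
)
paths <- write_cluster_ids(toy_list, tempdir())
```

In the real analysis `out_dir` would be the `../utilities/GTEX2013_sparse_load_voom` directory used above; `tempdir()` is used here only so the sketch runs anywhere.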
# Annotate the top genes of cluster 9 (name, summary, symbol) through mygene.
out <- mygene::queryMany(gene_list[9,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | symbol | name | query | notfound |
|---|---|---|---|---|---|
| This gene is a main control point for the regulation of gluconeogenesis. The cytosolic enzyme encoded by this gene, along with GTP, catalyzes the formation of phosphoenolpyruvate from oxaloacetate, with the release of carbon dioxide and GDP. The expression of this gene can be regulated by insulin, glucocorticoids, glucagon, cAMP, and diet. Defects in this gene are a cause of cytosolic phosphoenolpyruvate carboxykinase deficiency. A mitochondrial isozyme of the encoded protein also has been characterized. | 5105 | PCK1 | phosphoenolpyruvate carboxykinase 1 | ENSG00000124253 | NA |
| This gene encodes the heavy chain subunit of the pre-alpha-trypsin inhibitor complex. This complex may stabilize the extracellular matrix through its ability to bind hyaluronic acid. Polymorphisms of this gene may be associated with increased risk for schizophrenia and major depressive disorder. This gene is present in an inter-alpha-trypsin inhibitor family gene cluster on chromosome 3. | 3699 | ITIH3 | inter-alpha-trypsin inhibitor heavy chain 3 | ENSG00000162267 | NA |
| This gene encodes a plasma glycoprotein that binds heme with high affinity. The encoded protein is an acute phase protein that transports heme from the plasma to the liver and may be involved in protecting cells from oxidative stress. | 3263 | HPX | hemopexin | ENSG00000110169 | NA |
| This gene encodes a glycoprotein with an approximate molecular weight of 76.5 kDa. It is thought to have been created as a result of an ancient gene duplication event that led to generation of homologous C and N-terminal domains each of which binds one ion of ferric iron. The function of this protein is to transport iron from the intestine, reticuloendothelial system, and liver parenchymal cells to all proliferating cells in the body. This protein may also have a physiologic role as granulocyte/pollen-binding protein (GPBP) involved in the removal of certain organic matter and allergens from serum. | 7018 | TF | transferrin | ENSG00000091513 | NA |
| The mitochondrial enzyme encoded by this gene catalyzes synthesis of carbamoyl phosphate from ammonia and bicarbonate. This reaction is the first committed step of the urea cycle, which is important in the removal of excess urea from cells. The encoded protein may also represent a core mitochondrial nucleoid protein. Three transcript variants encoding different isoforms have been found for this gene. The shortest isoform may not be localized to the mitochondrion. Mutations in this gene have been associated with carbamoyl phosphate synthetase deficiency, susceptibility to persistent pulmonary hypertension, and susceptibility to venoocclusive disease after bone marrow transplantation. | 1373 | CPS1 | carbamoyl-phosphate synthase 1 | ENSG00000021826 | NA |
| This antimicrobial gene belongs to the cytokine gene family which encode secreted proteins involved in immunoregulatory and inflammatory processes. The protein encoded by this gene is structurally related to the CXC (Cys-X-Cys) subfamily of cytokines. Members of this subfamily are characterized by two cysteines separated by a single amino acid. This cytokine displays chemotactic activity for monocytes but not for lymphocytes, dendritic cells, neutrophils or macrophages. It has been implicated that this cytokine is involved in the homeostasis of monocyte-derived macrophages rather than in inflammation. | 9547 | CXCL14 | C-X-C motif chemokine ligand 14 | ENSG00000145824 | NA |
| NA | 104326055 | APOA1-AS | APOA1 antisense RNA | ENSG00000235910 | NA |
| This gene encodes a member of the apolipoprotein C1 family. This gene is expressed primarily in the liver, and it is activated when monocytes differentiate into macrophages. The encoded protein plays a central role in high density lipoprotein (HDL) and very low density lipoprotein (VLDL) metabolism. This protein has also been shown to inhibit cholesteryl ester transfer protein in plasma. A pseudogene of this gene is located 4 kb downstream in the same orientation, on the same chromosome. This gene is mapped to chromosome 19, where it resides within a apolipoprotein gene cluster. | 341 | APOC1 | apolipoprotein C1 | ENSG00000130208 | NA |
| The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 2244 | FGB | fibrinogen beta chain | ENSG00000171564 | NA |
| Polyspecific organic cation transporters in the liver, kidney, intestine, and other organs are critical for elimination of many endogenous small organic cations as well as a wide array of drugs and environmental toxins. This gene is one of three similar cation transporter genes located in a cluster on chromosome 6. The encoded protein contains twelve putative transmembrane domains and is a plasma integral membrane protein. Two transcript variants encoding two different isoforms have been found for this gene, but only the longer variant encodes a functional transporter. | 6580 | SLC22A1 | solute carrier family 22 member 1 | ENSG00000175003 | NA |
| The protein encoded by this gene is a metalloprotein that binds most of the copper in plasma and is involved in the peroxidation of Fe(II)transferrin to Fe(III) transferrin. Mutations in this gene cause aceruloplasminemia, which results in iron accumulation and tissue damage, and is associated with diabetes and neurologic abnormalities. Two transcript variants, one protein-coding and the other not protein-coding, have been found for this gene. | 1356 | CP | ceruloplasmin (ferroxidase) | ENSG00000047457 | NA |
| The protein encoded by this gene belongs to the lipocalin family. It is one of the three subunits that constitutes complement component 8 (C8), which is composed of a disulfide-linked C8 alpha-gamma heterodimer and a non-covalently associated C8 beta chain. C8 participates in the formation of the membrane attack complex (MAC) on bacterial cell membranes. While subunits alpha and beta play a role in complement-mediated bacterial killing, the gamma subunit is not required for the bactericidal activity. | 733 | C8G | complement component 8, gamma polypeptide | ENSG00000176919 | NA |
| This gene encodes a protein that contains a ubiquitin associated domain at the N-terminus, an SH3 domain, and a C-terminal domain with similarities to the catalytic motif of phosphoglycerate mutase. The encoded protein was found to inhibit endocytosis of epidermal growth factor receptor (EGFR) and platelet-derived growth factor receptor. | 84959 | UBASH3B | ubiquitin associated and SH3 domain containing B | ENSG00000154127 | NA |
| This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | 729238 | SFTPA2 | surfactant protein A2 | ENSG00000185303 | NA |
| The protein encoded by this gene is an enzyme in the catabolic pathway of tyrosine. The encoded protein catalyzes the conversion of 4-hydroxyphenylpyruvate to homogentisate. Defects in this gene are a cause of tyrosinemia type 3 (TYRO3) and hawkinsinuria (HAWK). Two transcript variants encoding different isoforms have been found for this gene. | 3242 | HPD | 4-hydroxyphenylpyruvate dioxygenase | ENSG00000158104 | NA |
| This gene encodes a GTPase-activating protein that activates the small guanine-nucleotide-binding protein Rap1 in platelets. The protein interacts with synaptotagmin-like protein 1 and Rab27 and regulates secretion of dense granules from platelets at sites of endothelial damage. Multiple transcript variants encoding different isoforms have been found for this gene. | 23108 | RAP1GAP2 | RAP1 GTPase activating protein 2 | ENSG00000132359 | NA |
| This gene encodes the protein core of a seminal plasma proteoglycan containing chondroitin- and heparan-sulfate chains. The protein’s function is unknown, although similarity to thyropin-type cysteine protease-inhibitors suggests its function may be related to protease inhibition. | 6695 | SPOCK1 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 1 | ENSG00000152377 | NA |
| This gene is a member of the septin gene family of nucleotide binding proteins, originally described in yeast as cell division cycle regulatory proteins. Septins are highly conserved in yeast, Drosophila, and mouse and appear to regulate cytoskeletal organization. Disruption of septin function disturbs cytokinesis and results in large multinucleate or polyploid cells. This gene is mapped to 22q11, the region frequently deleted in DiGeorge and velocardiofacial syndromes. A translocation involving the MLL gene and this gene has also been reported in patients with acute myeloid leukemia. Alternative splicing results in multiple transcript variants. The presence of a non-consensus polyA signal (AACAAT) in this gene also results in read-through transcription into the downstream neighboring gene (GP1BB; platelet glycoprotein Ib), whereby larger, non-coding transcripts are produced. | 5413 | SEPT5 | septin 5 | ENSG00000184702 | NA |
| NA | 90139 | TSPAN18 | tetraspanin 18 | ENSG00000157570 | NA |
| NA | 8608 | RDH16 | retinol dehydrogenase 16 (all-trans) | ENSG00000139547 | NA |
| This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | 3240 | HP | haptoglobin | ENSG00000257017 | NA |
| NA | 9645 | MICAL2 | microtubule associated monooxygenase, calponin and LIM domain containing 2 | ENSG00000133816 | NA |
| Protein kinase C (PKC) zeta is a member of the PKC family of serine/threonine kinases which are involved in a variety of cellular processes such as proliferation, differentiation and secretion. Unlike the classical PKC isoenzymes which are calcium-dependent, PKC zeta exhibits a kinase activity which is independent of calcium and diacylglycerol but not of phosphatidylserine. Furthermore, it is insensitive to typical PKC inhibitors and cannot be activated by phorbol ester. Unlike the classical PKC isoenzymes, it has only a single zinc finger module. These structural and biochemical properties indicate that the zeta subspecies is related to, but distinct from other isoenzymes of PKC. Alternative splicing results in multiple transcript variants encoding different isoforms. | 5590 | PRKCZ | protein kinase C zeta | ENSG00000067606 | NA |
| NA | ENSG00000269934 | RP5-1139B12.3 | NA | ENSG00000269934 | NA |
| This gene encodes a protein which binds with glycosaminoglycans to form part of the extracellular matrix. The protein contains thyroglobulin type-1, follistatin-like, and calcium-binding domains, and has glycosaminoglycan attachment sites in the acidic C-terminal region. Three alternatively spliced transcript variants that encode different protein isoforms have been described for this gene. | 9806 | SPOCK2 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 2 | ENSG00000107742 | NA |
| The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. | 1 | A1BG | alpha-1-B glycoprotein | ENSG00000121410 | NA |
| NA | 11123 | RCAN3 | RCAN family member 3 | ENSG00000117602 | NA |
| NA | ENSG00000214425 | LRRC37A4P | leucine-rich repeat containing 37 member A4, pseudogene | ENSG00000214425 | NA |
| The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | 125 | ADH1B | alcohol dehydrogenase 1B (class I), beta polypeptide | ENSG00000196616 | NA |
| The protein encoded by this gene is a bifunctional enzyme that channels 1-carbon units from formiminoglutamate, a metabolite of the histidine degradation pathway, to the folate pool. Mutations in this gene are associated with glutamate formiminotransferase deficiency. Alternatively spliced transcript variants have been found for this gene. | 10841 | FTCD | formimidoyltransferase cyclodeaminase | ENSG00000160282 | NA |
| NA | 55908 | ANGPTL8 | angiopoietin like 8 | ENSG00000130173 | NA |
| This gene encodes a nuclear protein with three C2H2-type zinc fingers, and functions as a transcriptional repressor. Chromosomal aberrations involving this gene are associated with endometrial stromal tumors. Alternatively spliced variants which encode different protein isoforms have been described; however, not all variants have been fully characterized | 221895 | JAZF1 | JAZF zinc finger 1 | ENSG00000153814 | NA |
| Tight junctions represent one mode of cell-to-cell adhesion in epithelial or endothelial cell sheets, forming continuous seals around cells and serving as a physical barrier to prevent solutes and water from passing freely through the paracellular space. These junctions are comprised of sets of continuous networking strands in the outwardly facing cytoplasmic leaflet, with complementary grooves in the inwardly facing extracytoplasmic leaflet. The protein encoded by this gene, a member of the claudin family, is an integral membrane protein and a component of tight junction strands. Loss of function mutations result in neonatal ichthyosis-sclerosing cholangitis syndrome. | 9076 | CLDN1 | claudin 1 | ENSG00000163347 | NA |
| The protein encoded by this gene contains a pleckstrin homology (PH) domain and an oxysterol-binding region. It binds oxysterols such as 7-ketocholesterol and may inhibit their cytotoxicity. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 23762 | OSBP2 | oxysterol binding protein 2 | ENSG00000184792 | NA |
| Syntaxin-1, synaptobrevin/VAMP, and SNAP25 interact to form the SNARE complex, which is required for synaptic vesicle docking and fusion. The protein encoded by this gene is membrane-associated and inhibits SNARE complex formation by binding free syntaxin-1. Expression of this gene appears to be brain-specific. Alternative splicing results in multiple transcript variants encoding different isoforms. | 9751 | SNPH | syntaphilin | ENSG00000101298 | NA |
| This gene encodes a member of a family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, which catalyze the transfer of N-acetylgalactosamine (GalNAc) from UDP-GalNAc to a serine or threonine residue on a polypeptide acceptor in the initial step of O-linked protein glycosylation. Mutations in this gene are associated with an increased susceptibility to colorectal cancer. | 79695 | GALNT12 | polypeptide N-acetylgalactosaminyltransferase 12 | ENSG00000119514 | NA |
| NA | 51560 | RAB6B | RAB6B, member RAS oncogene family | ENSG00000154917 | NA |
| This receptor binds insulin-like growth factor with a high affinity. It has tyrosine kinase activity. The insulin-like growth factor I receptor plays a critical role in transformation events. Cleavage of the precursor generates alpha and beta subunits. It is highly overexpressed in most malignant tissues where it functions as an anti-apoptotic agent by enhancing cell survival. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | 3480 | IGF1R | insulin like growth factor 1 receptor | ENSG00000140443 | NA |
| NA | 9764 | KIAA0513 | KIAA0513 | ENSG00000135709 | NA |
| NA | ENSG00000189316 | RP11-797H7.5 | NA | ENSG00000189316 | NA |
| NA | ENSG00000215861 | WI2-1896O14.1 | NA | ENSG00000215861 | NA |
| This gene encodes a protein thought to be a component of the radial spoke head in motile cilia and flagella. Mutations in this gene are associated with primary ciliary dyskinesia 12. Alternative splicing results in multiple transcript variants. | 221421 | RSPH9 | radial spoke head 9 homolog | ENSG00000172426 | NA |
| This protein belongs to the aldehyde dehydrogenase family of proteins. This enzyme is a mitochondrial matrix NAD-dependent dehydrogenase which catalyzes the second step of the proline degradation pathway, converting pyrroline-5-carboxylate to glutamate. Deficiency of this enzyme is associated with type II hyperprolinemia, an autosomal recessive disorder characterized by accumulation of delta-1-pyrroline-5-carboxylate (P5C) and proline. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | 8659 | ALDH4A1 | aldehyde dehydrogenase 4 family member A1 | ENSG00000159423 | NA |
| NA | 161145 | TMEM229B | transmembrane protein 229B | ENSG00000198133 | NA |
| The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 | ANKRD1 | ankyrin repeat domain 1 | ENSG00000148677 | NA |
| NA | 27124 | INPP5J | inositol polyphosphate-5-phosphatase J | ENSG00000185133 | NA |
| NA | 9911 | TMCC2 | transmembrane and coiled-coil domain family 2 | ENSG00000133069 | NA |
| This gene produces alternative transcripts encoding two distinct proteins. One protein is a transcriptional repressor, while the other isoform is a major component of specialized synapses known as synaptic ribbons. Both proteins contain a NAD+ binding domain similar to NAD+-dependent 2-hydroxyacid dehydrogenases. A portion of the 3’ untranslated region was used to map this gene to chromosome 21q21.3; however, it was noted that similar loci elsewhere in the genome are likely. Blast analysis shows that this gene is present on chromosome 10. Several transcript variants encoding two different isoforms have been found for this gene. | 1488 | CTBP2 | C-terminal binding protein 2 | ENSG00000175029 | NA |
| This gene encodes a member of the gap junction protein family. The gap junctions were first characterized by electron microscopy as regionally specialized structures on plasma membranes of contacting adherent cells. These structures were shown to consist of cell-to-cell channels that facilitate the transfer of ions and small molecules between cells. The gap junction proteins, also known as connexins, purified from fractions of enriched gap junctions from different tissues differ. According to sequence similarities at the nucleotide and amino acid levels, the gap junction proteins are divided into two categories, alpha and beta. Mutations in this gene are responsible for as much as 50% of pre-lingual, recessive deafness. | 2706 | GJB2 | gap junction protein beta 2 | ENSG00000165474 | NA |
| NA | 3797 | KIF3C | kinesin family member 3C | ENSG00000084731 | NA |
| NA | ENSG00000254680 | RP11-265D17.2 | NA | ENSG00000254680 | NA |
| IGSF4B is a brain-specific protein related to the calcium-independent cell-cell adhesion molecules known as nectins (see PVRL3; MIM 607147) (Kakunaga et al., 2005 [PubMed 15741237]). | 57863 | CADM3 | cell adhesion molecule 3 | ENSG00000162706 | NA |
| NA | ENSG00000268230 | CTD-2619J13.8 | NA | ENSG00000268230 | NA |
| NA | ENSG00000261172 | RP11-356C4.5 | NA | ENSG00000261172 | NA |
| This gene encodes apolipoprotein A-I, which is the major protein component of high density lipoprotein (HDL) in plasma. The encoded preproprotein is proteolytically processed to generate the mature protein, which promotes cholesterol efflux from tissues to the liver for excretion, and is a cofactor for lecithin cholesterolacyltransferase (LCAT), an enzyme responsible for the formation of most plasma cholesteryl esters. This gene is closely linked with two other apolipoprotein genes on chromosome 11. Defects in this gene are associated with HDL deficiencies, including Tangier disease, and with systemic non-neuropathic amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein. | 335 | APOA1 | apolipoprotein A1 | ENSG00000118137 | NA |
| This gene belongs to the family of reticulon encoding genes. Reticulons are associated with the endoplasmic reticulum, and are involved in neuroendocrine secretion or in membrane trafficking in neuroendocrine cells. This gene is considered to be a specific marker for neurological diseases and cancer, and is a potential molecular target for therapy. Alternative splicing results in multiple transcript variants. | 6252 | RTN1 | reticulon 1 | ENSG00000139970 | NA |
| This gene is specifically expressed in the central nervous system (CNS). It encodes a member of the DOCK (dedicator of cytokinesis) family of guanine nucleotide exchange factors (GEFs). This protein, dedicator of cytokinesis 3 (DOCK3), is also known as modifier of cell adhesion (MOCA) and presenilin-binding protein (PBP). The DOCK3 and DOCK1, -2 and -4 share several conserved amino acids in their DHR-2 (DOCK homology region 2) domains that are required for GEF activity, and bind directly to WAVE proteins [Wiskott-Aldrich syndrome protein (WASP) family Verprolin-homologous proteins] via their DHR-1 domains. The DOCK3 induces axonal outgrowth in CNS by stimulating membrane recruitment of the WAVE complex and activating the small G protein Rac1. This gene is associated with an attention deficit hyperactivity disorder-like phenotype by a complex chromosomal rearrangement. | 1795 | DOCK3 | dedicator of cytokinesis 3 | ENSG00000088538 | NA |
| This gene encodes a type I transmembrane protein that is localized to junctional complexes between endothelial and epithelial cells and may have a role in cell-cell adhesion. Expression of this gene in white adipose tissue is implicated in adipocyte maturation and development of obesity. This gene is also essential for normal intestinal development and mutations in the gene are associated with congenital short bowel syndrome. | 79827 | CLMP | CXADR-like membrane protein | ENSG00000166250 | NA |
| The protein encoded by this gene is a homeobox-containing transcription factor of the POU domain family. The encoded protein binds the octamer sequence 5’-ATTTGCAT-3’, a common transcription factor binding site in immunoglobulin gene promoters. Several transcript variants encoding different isoforms have been found for this gene. | 5452 | POU2F2 | POU class 2 homeobox 2 | ENSG00000028277 | NA |
| NA | ENSG00000232320 | AC009299.5 | NA | ENSG00000232320 | NA |
| NA | 100874235 | CACNA1C-AS2 | CACNA1C antisense RNA 2 | ENSG00000256271 | NA |
| This gene belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family. EPH and EPH-related receptors have been implicated in mediating developmental events, particularly in the nervous system. Receptors in the EPH subfamily typically have a single kinase domain and an extracellular region containing a Cys-rich domain and 2 fibronectin type III repeats. The ephrin receptors are divided into 2 groups based on the similarity of their extracellular domain sequences and their affinities for binding ephrin-A and ephrin-B ligands. Multiple transcript variants encoding different isoforms have been found for this gene. | 2043 | EPHA4 | EPH receptor A4 | ENSG00000116106 | NA |
| The protein encoded by this gene is a member of the p55 Stardust family of membrane-associated guanylate kinase (MAGUK) proteins, which function in the establishment of epithelial cell polarity. This family member forms a complex with the polarity protein DLG1 (discs, large homolog 1) and facilitates epithelial cell polarity and tight junction formation. Polymorphisms in this gene are associated with variations in site-specific bone mineral density (BMD). Alternative splicing results in multiple transcript variants. | 143098 | MPP7 | membrane palmitoylated protein 7 | ENSG00000150054 | NA |
| NA | 23127 | COLGALT2 | collagen beta(1-O)galactosyltransferase 2 | ENSG00000198756 | NA |
| This gene encodes a protein similar to the rat neuronal pentraxin receptor. The rat pentraxin receptor is an integral membrane protein that is thought to mediate neuronal uptake of the snake venom toxin, taipoxin, and its transport into the synapses. Studies in rat indicate that translation of this mRNA initiates at a non-AUG (CUG) codon. This may also be true for mouse and human, based on strong sequence conservation amongst these species. | 23467 | NPTXR | neuronal pentraxin receptor | ENSG00000221890 | NA |
| This gene encodes a member of the HOMER family of postsynaptic density scaffolding proteins that share a similar domain structure consisting of an N-terminal Enabled/vasodilator-stimulated phosphoprotein homology 1 domain which mediates protein-protein interactions, and a carboxy-terminal coiled-coil domain and two leucine zipper motifs that are involved in self-oligomerization. The encoded protein binds numerous other proteins including group I metabotropic glutamate receptors, inositol 1,4,5-trisphosphate receptors and amyloid precursor proteins and has been implicated in diverse biological functions such as neuronal signaling, T-cell activation and trafficking of amyloid beta peptides. Alternative splicing results in multiple transcript variants. | 9454 | HOMER3 | homer scaffolding protein 3 | ENSG00000051128 | NA |
| The gene is a member of the syntaxin family. The encoded protein is targeted to the apical membrane of epithelial cells where it forms clusters and is important in establishing and maintaining polarity necessary for protein trafficking involving vesicle fusion and exocytosis. Alternative splicing results in multiple transcript variants. | 6809 | STX3 | syntaxin 3 | ENSG00000166900 | NA |
| The obscurin gene spans more than 150 kb, contains over 80 exons and encodes a protein of approximately 720 kDa. The encoded protein contains 68 Ig domains, 2 fibronectin domains, 1 calcium/calmodulin-binding domain, 1 RhoGEF domain with an associated PH domain, and 2 serine-threonine kinase domains. This protein belongs to the family of giant sarcomeric signaling proteins that includes titin and nebulin, and may have a role in the organization of myofibrils during assembly and may mediate interactions between the sarcoplasmic reticulum and myofibrils. Alternatively spliced transcript variants encoding different isoforms have been identified. | 84033 | OBSCN | obscurin, cytoskeletal calmodulin and titin-interacting RhoGEF | ENSG00000154358 | NA |
| The leucine-rich repeat (LRR) family of proteins, including LRG1, have been shown to be involved in protein-protein interaction, signal transduction, and cell adhesion and development. LRG1 is expressed during granulocyte differentiation (O’Donnell et al., 2002 [PubMed 12223515]). | 116844 | LRG1 | leucine rich alpha-2-glycoprotein 1 | ENSG00000171236 | NA |
| NA | 57214 | CEMIP | cell migration inducing hyaluronan binding protein | ENSG00000103888 | NA |
| Hexokinases phosphorylate glucose to produce glucose-6-phosphate, the first step in most glucose metabolism pathways. This gene encodes a ubiquitous form of hexokinase which localizes to the outer membrane of mitochondria. Mutations in this gene have been associated with hemolytic anemia due to hexokinase deficiency. Alternative splicing of this gene results in several transcript variants which encode different isoforms, some of which are tissue-specific. | 3098 | HK1 | hexokinase 1 | ENSG00000156515 | NA |
| NA | ENSG00000227227 | AC017101.10 | NA | ENSG00000227227 | NA |
| NA | NA | NA | NA | ENSG00000270172 | TRUE |
| NA | ENSG00000252464 | RN7SKP70 | RNA, 7SK small nuclear pseudogene 70 | ENSG00000252464 | NA |
| NA | 115572 | FAM46B | family with sequence similarity 46 member B | ENSG00000158246 | NA |
| This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions in the inhibition of phospholipase A2 and cleavage of inositol 1,2-cyclic phosphate to form inositol 1-phosphate. This protein may also play a role in anti-coagulation. | 306 | ANXA3 | annexin A3 | ENSG00000138772 | NA |
| Polyspecific organic cation transporters in the liver, kidney, intestine, and other organs are critical for elimination of many endogenous small organic cations as well as a wide array of drugs and environmental toxins. The encoded protein is an organic cation transporter and plasma integral membrane protein containing eleven putative transmembrane domains as well as a nucleotide-binding site motif. Transport by this protein is at least partially ATP-dependent. | 6583 | SLC22A4 | solute carrier family 22 member 4 | ENSG00000197208 | NA |
| This locus encodes a sulfotransferase protein. The encoded enzyme catalyzes the sulfation of a nonreducing N-acetylglucosamine residue, and may play a role in biosynthesis of 6-sulfosialyl Lewis X antigen. | 9435 | CHST2 | carbohydrate sulfotransferase 2 | ENSG00000175040 | NA |
| This gene encodes a membrane-bound protein which is a member of the ELO family, proteins which participate in the biosynthesis of fatty acids. Consistent with the expression of the encoded protein in photoreceptor cells of the retina, mutations and small deletions in this gene are associated with Stargardt-like macular dystrophy (STGD3) and autosomal dominant Stargardt-like macular dystrophy (ADMD), also referred to as autosomal dominant atrophic macular degeneration. | 6785 | ELOVL4 | ELOVL fatty acid elongase 4 | ENSG00000118402 | NA |
| This gene encodes an inwardly rectifying K+ channel which may be blocked by divalent cations. This protein is thought to be one of multiple inwardly rectifying channels which contribute to the cardiac inward rectifier current (IK1). The gene is located within the Smith-Magenis syndrome region on chromosome 17. | 3768 | KCNJ12 | potassium voltage-gated channel subfamily J member 12 | ENSG00000184185 | NA |
| This gene encodes a protein associated with the cytoplasmic surface of synaptic vesicles. A subset of patients with stiff-man syndrome who were also affected by breast cancer are positive for autoantibodies against this protein. Alternate splicing of this gene results in two transcript variants encoding different isoforms. Additional splice variants have been described, but their full length sequences have not been determined. A pseudogene of this gene is found on chromosome 11. | 273 | AMPH | amphiphysin | ENSG00000078053 | NA |
| NA | 84940 | CORO6 | coronin 6 | ENSG00000167549 | NA |
| This gene is a member of the aggrecan/versican proteoglycan family. The protein encoded is a large chondroitin sulfate proteoglycan and is a major component of the extracellular matrix. This protein is involved in cell adhesion, proliferation, migration and angiogenesis and plays a central role in tissue morphogenesis and maintenance. Mutations in this gene are the cause of Wagner syndrome type 1. Multiple transcript variants encoding different isoforms have been found for this gene. | 1462 | VCAN | versican | ENSG00000038427 | NA |
| This gene encodes a filamentous actin-binding protein that may function in cell adhesion and migration. Mutations in this gene have been associated with dilated cardiomyopathy, also known as CMD1CC. Alternatively spliced transcript variants have been described. | 91624 | NEXN | nexilin F-actin binding protein | ENSG00000162614 | NA |
| NA | NA | NA | NA | ENSG00000229874 | TRUE |
| NA | ENSG00000239775 | AC017116.11 | NA | ENSG00000239775 | NA |
| The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. | 158471 | PRUNE2 | prune homolog 2 | ENSG00000106772 | NA |
| NA | 55365 | TMEM176A | transmembrane protein 176A | ENSG00000002933 | NA |
| NA | 10570 | DPYSL4 | dihydropyrimidinase like 4 | ENSG00000151640 | NA |
| This gene encodes a member of the latrophilin subfamily of G-protein coupled receptors (GPCR). Latrophilins may function in both cell adhesion and signal transduction. In experiments with non-human species, endogenous proteolytic cleavage within a cysteine-rich GPS (G-protein-coupled-receptor proteolysis site) domain resulted in two subunits (a large extracellular N-terminal cell adhesion subunit and a subunit with substantial similarity to the secretin/calcitonin family of GPCRs) being non-covalently bound at the cell membrane. Latrophilin-1 has been shown to recruit the neurotoxin from black widow spider venom, alpha-latrotoxin, to the synapse plasma membrane. Alternative splicing results in multiple variants encoding distinct isoforms. | 22859 | ADGRL1 | adhesion G protein-coupled receptor L1 | ENSG00000072071 | NA |
| NA | 57210 | SLC45A4 | solute carrier family 45 member 4 | ENSG00000022567 | NA |
| This gene encodes an enzyme involved in catalyzing the conversion of angiotensin I into a physiologically active peptide angiotensin II. Angiotensin II is a potent vasopressor and aldosterone-stimulating peptide that controls blood pressure and fluid-electrolyte balance. This enzyme plays a key role in the renin-angiotensin system. Many studies have associated the presence or absence of a 287 bp Alu repeat element in this gene with the levels of circulating enzyme or cardiovascular pathophysiologies. Multiple alternatively spliced transcript variants encoding different isoforms have been identified, and two most abundant spliced variants encode the somatic form and the testicular form, respectively, that are equally active. | 1636 | ACE | angiotensin I converting enzyme | ENSG00000159640 | NA |
| This gene encodes a member of the VPS10-related sortilin family of proteins. The encoded preproprotein is proteolytically processed by furin to generate the mature receptor. This receptor plays a role in the trafficking of different proteins to either the cell surface, or subcellular compartments such as lysosomes and endosomes. Expression levels of this gene may influence the risk of myocardial infarction in human patients. Alternative splicing results in multiple transcript variants. | 6272 | SORT1 | sortilin 1 | ENSG00000134243 | NA |
| NA | 64753 | CCDC136 | coiled-coil domain containing 136 | ENSG00000128596 | NA |
| G-protein signaling modulators (GPSMs) play diverse functional roles through their interaction with G-protein subunits. This gene encodes a receptor-independent activator of G protein signaling, which is one of several factors that influence the basal activity of G-protein signaling systems. The protein contains seven tetratricopeptide repeats in its N-terminal half and four G-protein regulatory (GPR) motifs in its C-terminal half. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 26086 | GPSM1 | G-protein signaling modulator 1 | ENSG00000160360 | NA |
| The Bloom syndrome gene product is related to the RecQ subset of DExH box-containing DNA helicases and has both DNA-stimulated ATPase and ATP-dependent DNA helicase activities. Mutations causing Bloom syndrome delete or alter helicase motifs and may disable the 3’-5’ helicase activity. The normal protein may act to suppress inappropriate recombination. | 641 | BLM | Bloom syndrome RecQ like helicase | ENSG00000197299 | NA |
| The phosphatidylethanolamine (PE)-binding proteins, including PEBP4, are an evolutionarily conserved family of proteins with pivotal biologic functions, such as lipid binding and inhibition of serine proteases (Wang et al., 2004 [PubMed 15302887]). | 157310 | PEBP4 | phosphatidylethanolamine binding protein 4 | ENSG00000134020 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and is known to metabolize as many as 25% of commonly prescribed drugs. Its substrates include antidepressants, antipsychotics, analgesics and antitussives, beta adrenergic blocking agents, antiarrhythmics and antiemetics. The gene is highly polymorphic in the human population; certain alleles result in the poor metabolizer phenotype, characterized by a decreased ability to metabolize the enzyme’s substrates. Some individuals with the poor metabolizer phenotype have no functional protein since they carry 2 null alleles whereas in other individuals the gene is absent. This gene can vary in copy number and individuals with the ultrarapid metabolizer phenotype can have 3 or more active copies of the gene. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 1565 | CYP2D6 | cytochrome P450 family 2 subfamily D member 6 | ENSG00000100197 | NA |
| This gene encodes a protein that belongs to the microtubule-associated protein family. The proteins of this family are thought to be involved in microtubule assembly, which is an essential step in neurogenesis. The product of this gene is a precursor polypeptide that presumably undergoes proteolytic processing to generate the final MAP1A heavy chain and LC2 light chain. Expression of this gene is almost exclusively in the brain. Studies of the rat microtubule-associated protein 1A gene suggested a role in early events of spinal cord development. | 4130 | MAP1A | microtubule associated protein 1A | ENSG00000166963 | NA |
| The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | 5265 | SERPINA1 | serpin family A member 1 | ENSG00000197249 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 9, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
out <- mygene::queryMany(gene_list[10,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | X_id | name | summary | notfound |
|---|---|---|---|---|---|
| HSPA6 | ENSG00000173110 | 3310 | heat shock protein family A (Hsp70) member 6 | NA | NA |
| HILPDA | ENSG00000135245 | 29923 | hypoxia inducible lipid droplet associated | NA | NA |
| RP11-155G14.6 | ENSG00000240758 | ENSG00000240758 | NA | NA | NA |
| PTGS2 | ENSG00000073756 | 5743 | prostaglandin-endoperoxide synthase 2 | Prostaglandin-endoperoxide synthase (PTGS), also known as cyclooxygenase, is the key enzyme in prostaglandin biosynthesis, and acts both as a dioxygenase and as a peroxidase. There are two isozymes of PTGS: a constitutive PTGS1 and an inducible PTGS2, which differ in their regulation of expression and tissue distribution. This gene encodes the inducible isozyme. It is regulated by specific stimulatory events, suggesting that it is responsible for the prostanoid biosynthesis involved in inflammation and mitogenesis. | NA |
| ZC3H12A | ENSG00000163874 | 80149 | zinc finger CCCH-type containing 12A | ZC3H12A is an MCP1 (CCL2; MIM 158105)-induced protein that acts as a transcriptional activator and causes cell death of cardiomyocytes, possibly via induction of genes associated with apoptosis. | NA |
| KRT8P50 | ENSG00000260799 | ENSG00000260799 | keratin 8 pseudogene 50 | NA | NA |
| HSPA1B | ENSG00000204388 | 3304 | heat shock protein family A (Hsp70) member 1B | This intronless gene encodes a 70kDa heat shock protein which is a member of the heat shock protein 70 family. In conjunction with other heat shock proteins, this protein stabilizes existing proteins against aggregation and mediates the folding of newly translated proteins in the cytosol and in organelles. It is also involved in the ubiquitin-proteasome pathway through interaction with the AU-rich element RNA-binding protein 1. The gene is located in the major histocompatibility complex class III region, in a cluster with two closely related genes which encode similar proteins. | NA |
| SOCS3 | ENSG00000184557 | 9021 | suppressor of cytokine signaling 3 | This gene encodes a member of the STAT-induced STAT inhibitor (SSI), also known as suppressor of cytokine signaling (SOCS), family. SSI family members are cytokine-inducible negative regulators of cytokine signaling. The expression of this gene is induced by various cytokines, including IL6, IL10, and interferon (IFN)-gamma. The protein encoded by this gene can bind to JAK2 kinase, and inhibit the activity of JAK2 kinase. Studies of the mouse counterpart of this gene suggested the roles of this gene in the negative regulation of fetal liver hematopoiesis, and placental development. | NA |
| RGS2 | ENSG00000116741 | 5997 | regulator of G-protein signaling 2 | Regulator of G protein signaling (RGS) family members are regulatory molecules that act as GTPase activating proteins (GAPs) for G alpha subunits of heterotrimeric G proteins. RGS proteins are able to deactivate G protein subunits of the Gi alpha, Go alpha and Gq alpha subtypes. They drive G proteins into their inactive GDP-bound forms. Regulator of G protein signaling 2 belongs to this family. The protein acts as a mediator of myeloid differentiation and may play a role in leukemogenesis. | NA |
| IER3 | ENSG00000137331 | 8870 | immediate early response 3 | This gene functions in the protection of cells from Fas- or tumor necrosis factor type alpha-induced apoptosis. Partially degraded and unspliced transcripts are found after virus infection in vitro, but these transcripts are not found in vivo and do not generate a valid protein. | NA |
| SERBP1P3 | ENSG00000242142 | ENSG00000242142 | SERPINE1 mRNA binding protein 1 pseudogene 3 | NA | NA |
| ARID5A | ENSG00000196843 | 10865 | AT-rich interaction domain 5A | Members of the ARID protein family, including ARID5A, have diverse functions but all appear to play important roles in development, tissue-specific gene expression, and regulation of cell growth (Patsialou et al., 2005 [PubMed 15640446]). | NA |
| CHRNE | ENSG00000108556 | 1145 | cholinergic receptor nicotinic epsilon subunit | Acetylcholine receptors at mature mammalian neuromuscular junctions are pentameric protein complexes composed of four subunits in the ratio of two alpha subunits to one beta, one epsilon, and one delta subunit. The acetylcholine receptor changes subunit composition shortly after birth when the epsilon subunit replaces the gamma subunit seen in embryonic receptors. Mutations in the epsilon subunit are associated with congenital myasthenic syndrome. | NA |
| BCL3 | ENSG00000069399 | 602 | B-cell CLL/lymphoma 3 | This gene is a proto-oncogene candidate. It is identified by its translocation into the immunoglobulin alpha-locus in some cases of B-cell leukemia. The protein encoded by this gene contains seven ankyrin repeats, which are most closely related to those found in I kappa B proteins. This protein functions as a transcriptional co-activator that activates through its association with NF-kappa B homodimers. The expression of this gene can be induced by NF-kappa B, which forms a part of the autoregulatory loop that controls the nuclear residence of p50 NF-kappa B. | NA |
| FOSL1 | ENSG00000175592 | 8061 | FOS like 1, AP-1 transcription factor subunit | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| CSRNP1 | ENSG00000144655 | 64651 | cysteine and serine rich nuclear protein 1 | This gene encodes a protein that localizes to the nucleus and expression of this gene is induced in response to elevated levels of axin. The Wnt signalling pathway, which is negatively regulated by axin, is important in axis formation in early development and impaired regulation of this signalling pathway is often involved in tumors. A decreased level of expression of this gene in tumors compared to the level of expression in their corresponding normal tissues suggests that this gene product has a tumor suppressor function. Alternative splicing results in multiple transcript variants. | NA |
| LOC105379695 | ENSG00000272273 | 105379695 | uncharacterized LOC105379695 | NA | NA |
| RP11-456P18.2 | ENSG00000229808 | ENSG00000229808 | NA | NA | NA |
| C3orf52 | ENSG00000114529 | 79669 | chromosome 3 open reading frame 52 | NA | NA |
| CDKN1A | ENSG00000124762 | 1026 | cyclin-dependent kinase inhibitor 1A | This gene encodes a potent cyclin-dependent kinase inhibitor. The encoded protein binds to and inhibits the activity of cyclin-cyclin-dependent kinase2 or -cyclin-dependent kinase4 complexes, and thus functions as a regulator of cell cycle progression at G1. The expression of this gene is tightly controlled by the tumor suppressor protein p53, through which this protein mediates the p53-dependent cell cycle G1 phase arrest in response to a variety of stress stimuli. This protein can interact with proliferating cell nuclear antigen, a DNA polymerase accessory factor, and plays a regulatory role in S phase DNA replication and DNA damage repair. This protein was reported to be specifically cleaved by CASP3-like caspases, which thus leads to a dramatic activation of cyclin-dependent kinase2, and may be instrumental in the execution of apoptosis following caspase activation. Mice that lack this gene have the ability to regenerate damaged or missing tissue. Multiple alternatively spliced variants have been found for this gene. | NA |
| NFKB2 | ENSG00000077150 | 4791 | nuclear factor kappa B subunit 2 | This gene encodes a subunit of the transcription factor complex nuclear factor-kappa-B (NFkB). The NFkB complex is expressed in numerous cell types and functions as a central activator of genes involved in inflammation and immune function. The protein encoded by this gene can function as both a transcriptional activator or repressor depending on its dimerization partner. The p100 full-length protein is co-translationally processed into a p52 active form. Chromosomal rearrangements and translocations of this locus have been observed in B cell lymphomas, some of which may result in the formation of fusion proteins. There is a pseudogene for this gene on chromosome 18. Alternative splicing results in multiple transcript variants. | NA |
| SNORA73B | ENSG00000200087 | ENSG00000200087 | small nucleolar RNA, H/ACA box 73B | NA | NA |
| SLC7A5 | ENSG00000103257 | 8140 | solute carrier family 7 member 5 | NA | NA |
| NA | ENSG00000182368 | NA | NA | NA | TRUE |
| BCL2A1 | ENSG00000140379 | 597 | BCL2 related protein A1 | This gene encodes a member of the BCL-2 protein family. The proteins of this family form hetero- or homodimers and act as anti- and pro-apoptotic regulators that are involved in a wide variety of cellular activities such as embryonic development, homeostasis and tumorigenesis. The protein encoded by this gene is able to reduce the release of pro-apoptotic cytochrome c from mitochondria and block caspase activation. This gene is a direct transcription target of NF-kappa B in response to inflammatory mediators, and is up-regulated by different extracellular signals, such as granulocyte-macrophage colony-stimulating factor (GM-CSF), CD40, phorbol ester and inflammatory cytokine TNF and IL-1, which suggests a cytoprotective function that is essential for lymphocyte activation as well as cell survival. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| CCDC150P1 | ENSG00000256304 | ENSG00000256304 | coiled-coil domain containing 150 pseudogene 1 | NA | NA |
| CHI3L1 | ENSG00000133048 | 1116 | chitinase 3 like 1 | Chitinases catalyze the hydrolysis of chitin, which is an abundant glycopolymer found in insect exoskeletons and fungal cell walls. The glycoside hydrolase 18 family of chitinases includes eight human family members. This gene encodes a glycoprotein member of the glycosyl hydrolase 18 family. The protein lacks chitinase activity and is secreted by activated macrophages, chondrocytes, neutrophils and synovial cells. The protein is thought to play a role in the process of inflammation and tissue remodeling. | NA |
| PIM1 | ENSG00000137193 | 5292 | Pim-1 proto-oncogene, serine/threonine kinase | The protein encoded by this gene belongs to the Ser/Thr protein kinase family, and PIM subfamily. This gene is expressed primarily in B-lymphoid and myeloid cell lines, and is overexpressed in hematopoietic malignancies and in prostate cancer. It plays a role in signal transduction in blood cells, contributing to both cell proliferation and survival, and thus provides a selective advantage in tumorigenesis. Both the human and orthologous mouse genes have been reported to encode two isoforms (with preferential cellular localization) resulting from the use of alternative in-frame translation initiation codons, the upstream non-AUG (CUG) and downstream AUG codons (PMIDs:16186805, 1825810). | NA |
| ZFP36 | ENSG00000128016 | 7538 | ZFP36 ring finger protein | NA | NA |
| IL4R | ENSG00000077238 | 3566 | interleukin 4 receptor | This gene encodes the alpha chain of the interleukin-4 receptor, a type I transmembrane protein that can bind interleukin 4 and interleukin 13 to regulate IgE production. The encoded protein also can bind interleukin 4 to promote differentiation of Th2 cells. A soluble form of the encoded protein can be produced by proteolysis of the membrane-bound protein, and this soluble form can inhibit IL4-mediated cell proliferation and IL5 upregulation by T-cells. Allelic variations in this gene have been associated with atopy, a condition that can manifest itself as allergic rhinitis, sinusitis, asthma, or eczema. Polymorphisms in this gene are also associated with resistance to human immunodeficiency virus type-1 infection. Alternate splicing results in multiple transcript variants. | NA |
| ATF3 | ENSG00000162772 | 467 | activating transcription factor 3 | This gene encodes a member of the mammalian activation transcription factor/cAMP responsive element-binding (CREB) protein family of transcription factors. This gene is induced by a variety of signals, including many of those encountered by cancer cells, and is involved in the complex process of cellular stress response. Multiple transcript variants encoding different isoforms have been found for this gene. It is possible that alternative splicing of this gene may be physiologically important in the regulation of target genes. | NA |
| SIK1 | ENSG00000142178 | 150094 | salt inducible kinase 1 | NA | NA |
| YBX3 | ENSG00000060138 | 8531 | Y-box binding protein 3 | NA | NA |
| TNFAIP3 | ENSG00000118503 | 7128 | TNF alpha induced protein 3 | This gene was identified as a gene whose expression is rapidly induced by the tumor necrosis factor (TNF). The protein encoded by this gene is a zinc finger protein and ubiquitin-editing enzyme, and has been shown to inhibit NF-kappa B activation as well as TNF-mediated apoptosis. The encoded protein, which has both ubiquitin ligase and deubiquitinase activities, is involved in the cytokine-mediated immune and inflammatory responses. Several transcript variants encoding the same protein have been found for this gene. | NA |
| HIST1H1E | ENSG00000168298 | 3008 | histone cluster 1, H1e | Histones are basic nuclear proteins responsible for nucleosome structure of the chromosomal fiber in eukaryotes. Two molecules of each of the four core histones (H2A, H2B, H3, and H4) form an octamer, around which approximately 146 bp of DNA is wrapped in repeating units, called nucleosomes. The linker histone, H1, interacts with linker DNA between nucleosomes and functions in the compaction of chromatin into higher order structures. This gene is intronless and encodes a replication-dependent histone that is a member of the histone H1 family. Transcripts from this gene lack polyA tails but instead contain a palindromic termination element. This gene is found in the large histone gene cluster on chromosome 6. | NA |
| MAFF | ENSG00000185022 | 23764 | MAF bZIP transcription factor F | The protein encoded by this gene is a basic leucine zipper (bZIP) transcription factor that lacks a transactivation domain. It is known to bind the US-2 DNA element in the promoter of the oxytocin receptor (OTR) gene and most likely heterodimerizes with other leucine zipper-containing proteins to enhance expression of the OTR gene during term pregnancy. The encoded protein can also form homodimers, and since it lacks a transactivation domain, the homodimer may act as a repressor of transcription. This gene may also be involved in the cellular stress response. Multiple transcript variants encoding two different isoforms have been found for this gene. | NA |
| GPR84 | ENSG00000139572 | 53831 | G protein-coupled receptor 84 | NA | NA |
| BHLHE40 | ENSG00000134107 | 8553 | basic helix-loop-helix family member e40 | This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL’s transactivation of PER1. This gene is believed to be involved in the control of circadian rhythm and cell differentiation. | NA |
| PMAIP1 | ENSG00000141682 | 5366 | phorbol-12-myristate-13-acetate-induced protein 1 | NA | NA |
| GADD45B | ENSG00000099860 | 4616 | growth arrest and DNA damage inducible beta | This gene is a member of a group of genes whose transcript levels are increased following stressful growth arrest conditions and treatment with DNA-damaging agents. The genes in this group respond to environmental stresses by mediating activation of the p38/JNK pathway. This activation is mediated via their proteins binding and activating MTK1/MEKK4 kinase, which is an upstream activator of both p38 and JNK MAPKs. The function of these genes or their protein products is involved in the regulation of growth and apoptosis. These genes are regulated by different mechanisms, but they are often coordinately expressed and can function cooperatively in inhibiting cell growth. | NA |
| SLC2A3 | ENSG00000059804 | 6515 | solute carrier family 2 member 3 | NA | NA |
| RP13-638C3.2 | ENSG00000262652 | ENSG00000262652 | NA | NA | NA |
| NAMPT | ENSG00000105835 | 10135 | nicotinamide phosphoribosyltransferase | This gene encodes a protein that catalyzes the condensation of nicotinamide with 5-phosphoribosyl-1-pyrophosphate to yield nicotinamide mononucleotide, one step in the biosynthesis of nicotinamide adenine dinucleotide. The protein belongs to the nicotinic acid phosphoribosyltransferase (NAPRTase) family and is thought to be involved in many important biological processes, including metabolism, stress response and aging. This gene has a pseudogene on chromosome 10. | NA |
| LOC100506142 | ENSG00000250116 | 100506142 | uncharacterized LOC100506142 | NA | NA |
| IL1B | ENSG00000125538 | 3553 | interleukin 1 beta | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This cytokine is produced by activated macrophages as a proprotein, which is proteolytically processed to its active form by caspase 1 (CASP1/ICE). This cytokine is an important mediator of the inflammatory response, and is involved in a variety of cellular activities, including cell proliferation, differentiation, and apoptosis. The induction of cyclooxygenase-2 (PTGS2/COX2) by this cytokine in the central nervous system (CNS) is found to contribute to inflammatory pain hypersensitivity. This gene and eight other interleukin 1 family genes form a cytokine gene cluster on chromosome 2. | NA |
| RP11-373D23.2 | ENSG00000270640 | ENSG00000270640 | NA | NA | NA |
| SNORA31 | ENSG00000199477 | 677814 | small nucleolar RNA, H/ACA box 31 | NA | NA |
| DUSP2 | ENSG00000158050 | 1844 | dual specificity phosphatase 2 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1 and ERK2, is predominantly expressed in hematopoietic tissues, and is localized in the nucleus. | NA |
| NAMPTP1 | ENSG00000229644 | ENSG00000229644 | nicotinamide phosphoribosyltransferase pseudogene 1 | NA | NA |
| RGPD2 | ENSG00000185304 | 729857 | RANBP2-like and GRIP domain containing 2 | NA | NA |
| RP11-34P13.15 | ENSG00000268903 | ENSG00000268903 | NA | NA | NA |
| HUS1B | ENSG00000188996 | 135458 | HUS1 checkpoint clamp component B | The protein encoded by this gene is most closely related to HUS1, a component of a cell cycle checkpoint protein complex involved in cell cycle arrest in response to DNA damage. This protein can interact with the check point protein RAD1 but not with RAD9. Overexpression of this protein has been shown to induce cell death, which suggests a related but distinct role of this protein, as compared to the HUS1. | NA |
| AC017104.6 | ENSG00000224376 | ENSG00000224376 | NA | NA | NA |
| RP4-536B24.2 | ENSG00000260466 | ENSG00000260466 | NA | NA | NA |
| RP5-1056L3.3 | ENSG00000226396 | ENSG00000226396 | NA | NA | NA |
| CHMP4BP1 | ENSG00000258469 | ENSG00000258469 | charged multivesicular body protein 4B pseudogene 1 | NA | NA |
| DNAJB1 | ENSG00000132002 | 3337 | DnaJ heat shock protein family (Hsp40) member B1 | This gene encodes a member of the DnaJ or Hsp40 (heat shock protein 40 kD) family of proteins. DNAJ family members are characterized by a highly conserved amino acid stretch called the ‘J-domain’ and function as one of the two major classes of molecular chaperones involved in a wide range of cellular events, such as protein folding and oligomeric protein complex assembly. The encoded protein is a molecular chaperone that stimulates the ATPase activity of Hsp70 heat-shock proteins in order to promote protein folding and prevent misfolded protein aggregation. Alternative splicing results in multiple transcript variants. | NA |
| SBNO2 | ENSG00000064932 | 22904 | strawberry notch homolog 2 (Drosophila) | NA | NA |
| JUNB | ENSG00000171223 | 3726 | JunB proto-oncogene, AP-1 transcription factor subunit | NA | NA |
| AC005363.9 | ENSG00000255513 | ENSG00000255513 | NA | NA | NA |
| RP11-888D10.4 | ENSG00000273284 | ENSG00000273284 | NA | NA | NA |
| AC004471.9 | ENSG00000223461 | ENSG00000223461 | NA | NA | NA |
| RFX2 | ENSG00000087903 | 5990 | regulatory factor X2 | This gene is a member of the regulatory factor X gene family, which encodes transcription factors that contain a highly-conserved winged helix DNA binding domain. The protein encoded by this gene is structurally related to regulatory factors X1, X3, X4, and X5. It is a transcriptional activator that can bind DNA as a monomer or as a heterodimer with other RFX family members. This protein can bind to cis elements in the promoter of the IL-5 receptor alpha gene. Two transcript variants encoding different isoforms have been described for this gene, and both variants utilize alternative polyadenylation sites. | NA |
| RP11-563J2.3 | ENSG00000212743 | ENSG00000212743 | NA | NA | NA |
| RP11-324I22.3 | ENSG00000269952 | ENSG00000269952 | NA | NA | NA |
| MYC | ENSG00000136997 | 4609 | v-myc avian myelocytomatosis viral oncogene homolog | The protein encoded by this gene is a multifunctional, nuclear phosphoprotein that plays a role in cell cycle progression, apoptosis and cellular transformation. It functions as a transcription factor that regulates transcription of specific target genes. Mutations, overexpression, rearrangement and translocation of this gene have been associated with a variety of hematopoietic tumors, leukemias and lymphomas, including Burkitt lymphoma. There is evidence to show that alternative translation initiations from an upstream, in-frame non-AUG (CUG) and a downstream AUG start site result in the production of two isoforms with distinct N-termini. The synthesis of non-AUG initiated protein is suppressed in Burkitt’s lymphomas, suggesting its importance in the normal function of this gene. | NA |
| SNORA7A | ENSG00000207496 | 619563 | small nucleolar RNA, H/ACA box 7A | NA | NA |
| AC073410.1 | ENSG00000236047 | ENSG00000236047 | NA | NA | NA |
| AZU1 | ENSG00000172232 | 566 | azurocidin 1 | Azurophil granules, specialized lysosomes of the neutrophil, contain at least 10 proteins implicated in the killing of microorganisms. This gene encodes a preproprotein that is proteolytically processed to generate a mature azurophil granule antibiotic protein, with monocyte chemotactic and antimicrobial activity. It is also an important multifunctional inflammatory mediator. This encoded protein is a member of the serine protease gene family but it is not a serine proteinase, because the active site serine and histidine residues are replaced. The genes encoding this protein, neutrophil elastase 2, and proteinase 3 are in a cluster located at chromosome 19pter. All 3 genes are expressed coordinately and their protein products are packaged together into azurophil granules during neutrophil differentiation. | NA |
| NFIL3 | ENSG00000165030 | 4783 | nuclear factor, interleukin 3 regulated | The protein encoded by this gene is a transcriptional regulator that binds as a homodimer to activating transcription factor (ATF) sites in many cellular and viral promoters. The encoded protein represses PER1 and PER2 expression and therefore plays a role in the regulation of circadian rhythm. Three transcript variants encoding the same protein have been found for this gene. | NA |
| FBXW4P1 | ENSG00000230701 | 26226 | F-box and WD repeat domain containing 4 pseudogene 1 | NA | NA |
| PLAUR | ENSG00000011422 | 5329 | plasminogen activator, urokinase receptor | This gene encodes the receptor for urokinase plasminogen activator and, given its role in localizing and promoting plasmin formation, likely influences many normal and pathological processes related to cell-surface plasminogen activation and localized degradation of the extracellular matrix. It binds both the proprotein and mature forms of urokinase plasminogen activator and permits the activation of the receptor-bound pro-enzyme by plasmin. The protein lacks transmembrane or cytoplasmic domains and may be anchored to the plasma membrane by a glycosyl-phosphatidylinositol (GPI) moiety following cleavage of the nascent polypeptide near its carboxy-terminus. However, a soluble protein is also produced in some cell types. Alternative splicing results in multiple transcript variants encoding different isoforms. The proprotein experiences several post-translational cleavage reactions that have not yet been fully defined. | NA |
| ICAM1 | ENSG00000090339 | 3383 | intercellular adhesion molecule 1 | This gene encodes a cell surface glycoprotein which is typically expressed on endothelial cells and cells of the immune system. It binds to integrins of type CD11a / CD18, or CD11b / CD18 and is also exploited by Rhinovirus as a receptor. | NA |
| RP11-343H5.4 | ENSG00000224114 | ENSG00000224114 | NA | NA | NA |
| UBE2R2-AS1 | ENSG00000235481 | ENSG00000235481 | UBE2R2 antisense RNA 1 | NA | NA |
| RELB | ENSG00000104856 | 5971 | RELB proto-oncogene, NF-kB subunit | NA | NA |
| SNORA25 | ENSG00000207112 | 684959 | small nucleolar RNA, H/ACA box 25 | NA | NA |
| NA | ENSG00000197697 | NA | NA | NA | TRUE |
| CTRL | ENSG00000141086 | 1506 | chymotrypsin like | NA | NA |
| RNF122 | ENSG00000133874 | 79845 | ring finger protein 122 | The encoded protein contains a RING finger, a motif present in a variety of functionally distinct proteins and known to be involved in protein-protein and protein-DNA interactions. The encoded protein is localized to the endoplasmic reticulum and golgi apparatus, and may be associated with cell viability. | NA |
| CTD-2369P2.8 | ENSG00000267607 | ENSG00000267607 | NA | NA | NA |
| SLC11A1 | ENSG00000018280 | 6556 | solute carrier family 11 member 1 | This gene is a member of the solute carrier family 11 (proton-coupled divalent metal ion transporters) family and encodes a multi-pass membrane protein. The protein functions as a divalent transition metal (iron and manganese) transporter involved in iron metabolism and host resistance to certain pathogens. Mutations in this gene have been associated with susceptibility to infectious diseases such as tuberculosis and leprosy, and inflammatory diseases such as rheumatoid arthritis and Crohn disease. Alternatively spliced variants that encode different protein isoforms have been described but the full-length nature of only one has been determined. | NA |
| PIGHP1 | ENSG00000259657 | ENSG00000259657 | phosphatidylinositol glycan anchor biosynthesis class H pseudogene 1 | NA | NA |
| SNORA64 | ENSG00000207405 | 26784 | small nucleolar RNA, H/ACA box 64 | NA | NA |
| TREM1 | ENSG00000124731 | 54210 | triggering receptor expressed on myeloid cells 1 | This gene encodes a receptor belonging to the Ig superfamily that is expressed on myeloid cells. This protein amplifies neutrophil and monocyte-mediated inflammatory responses triggered by bacterial and fungal infections by stimulating release of pro-inflammatory chemokines and cytokines, as well as increased surface expression of cell activation markers. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | NA |
| BTG3 | ENSG00000154640 | 10950 | BTG family member 3 | The protein encoded by this gene is a member of the BTG/Tob family. This family has structurally related proteins that appear to have antiproliferative properties. This encoded protein might play a role in neurogenesis in the central nervous system. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| NME2P1 | ENSG00000123009 | ENSG00000123009 | NME/NM23 nucleoside diphosphate kinase 2 pseudogene 1 | NA | NA |
| TMEM217 | ENSG00000172738 | 221468 | transmembrane protein 217 | NA | NA |
| RP11-22N19.2 | ENSG00000273320 | ENSG00000273320 | NA | NA | NA |
| SNORD10 | ENSG00000238917 | ENSG00000238917 | small nucleolar RNA, C/D box 10 | NA | NA |
| NA | ENSG00000179294 | NA | NA | NA | TRUE |
| RP11-727F15.13 | ENSG00000269463 | ENSG00000269463 | NA | NA | NA |
| CXCL1 | ENSG00000163739 | 2919 | C-X-C motif chemokine ligand 1 | This antimicrobial gene encodes a member of the CXC subfamily of chemokines. The encoded protein is a secreted growth factor that signals through the G-protein coupled receptor, CXC receptor 2. This protein plays a role in inflammation and as a chemoattractant for neutrophils. Aberrant expression of this protein is associated with the growth and progression of certain tumors. A naturally occurring processed form of this protein has increased chemotactic activity. Alternate splicing results in coding and non-coding variants of this gene. A pseudogene of this gene is found on chromosome 4. | NA |
| RP13-638C3.3 | ENSG00000262147 | ENSG00000262147 | NA | NA | NA |
| AP000593.7 | ENSG00000255843 | ENSG00000255843 | NA | NA | NA |
| RP11-269F19.2 | ENSG00000225721 | ENSG00000225721 | NA | NA | NA |
| TC2N | ENSG00000165929 | 123036 | tandem C2 domains, nuclear | NA | NA |
| AQP9 | ENSG00000103569 | 366 | aquaporin 9 | The aquaporins are a family of water-selective membrane channels. This gene encodes a member of a subset of aquaporins called the aquaglyceroporins. This protein allows passage of a broad range of noncharged solutes and also stimulates urea transport and osmotic water permeability. This protein may also facilitate the uptake of glycerol in hepatic tissue. The encoded protein may also play a role in specialized leukocyte functions such as immunological response and bactericidal activity. Alternate splicing results in multiple transcript variants. | NA |
| HIST2H2BF | ENSG00000203814 | 440689 | histone cluster 2, H2bf | Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. This structure consists of approximately 146 bp of DNA wrapped around a nucleosome, an octamer composed of pairs of each of the four core histones (H2A, H2B, H3, and H4). The chromatin fiber is further compacted through the interaction of a linker histone, H1, with the DNA between the nucleosomes to form higher order chromatin structures. This gene encodes a replication-dependent histone that is a member of the histone H2B family and is found in a histone cluster on chromosome 1. | NA |
| NA | ENSG00000204807 | NA | NA | NA | TRUE |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 10, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
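The query, display, and write steps are repeated by hand for each cluster, and the column order of the returned table differs from one query to the next. A minimal sketch of a wrapper is given below. The function `annotate_cluster`, its arguments, and the fixed column order are hypothetical and not part of the original analysis; the query function is passed in as an argument so that the sketch runs without network access.

```r
# Hypothetical helper (an assumption, not the original code): annotate one
# cluster, fix the column order, and write the queried gene IDs to a file.
annotate_cluster <- function(gene_ids, k, out_dir, query_fun) {
  # query_fun is expected to return something coercible to a data frame,
  # with at least a "query" column (as mygene::queryMany does above).
  out <- as.data.frame(query_fun(gene_ids))

  # Use one column order for every cluster; add missing columns as NA.
  cols <- c("query", "symbol", "name", "X_id", "summary", "notfound")
  for (m in setdiff(cols, names(out))) out[[m]] <- NA
  out <- out[, cols, drop = FALSE]

  # One gene ID per line, same file naming scheme as in the chunks above.
  write.table(out$query,
              file.path(out_dir, paste0("gene_names_clus_", k, ".txt")),
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  out
}

# Assumed usage with the query settings from this document (needs network):
# query_fun <- function(ids) mygene::queryMany(ids, scopes = "ensembl.gene",
#   fields = c("name", "summary", "symbol"), species = "human")
# out <- annotate_cluster(gene_list[11, ], 11,
#   "../utilities/GTEX2013_sparse_load_voom", query_fun)
```

Passing the query function as an argument keeps the file-writing and column-ordering logic testable with a stub; `kable(out)` can then be called on the returned data frame as before.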
out <- mygene::queryMany(gene_list[11,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | summary | X_id | query | symbol | notfound |
|---|---|---|---|---|---|
| glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | 2813 | ENSG00000169347 | GP2 | NA |
| spexin hormone | The protein encoded by this gene is a hormone involved in modulation of cardiovascular and renal function. It has also been shown in rats to cause weight loss. Several transcript variants have been found for this gene. | 80763 | ENSG00000134548 | SPX | NA |
| regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | 5967 | ENSG00000115386 | REG1A | NA |
| protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | 5644 | ENSG00000204983 | PRSS1 | NA |
| chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | 10136 | ENSG00000142789 | CELA3A | NA |
| polymeric immunoglobulin receptor | This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | 5284 | ENSG00000162896 | PIGR | NA |
| pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | 5406 | ENSG00000175535 | PNLIP | NA |
| phospholipase A2 group IB | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | 5319 | ENSG00000170890 | PLA2G1B | NA |
| chymotrypsin like elastase family member 3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | 23436 | ENSG00000219073 | CELA3B | NA |
| fibrinogen beta chain | The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 2244 | ENSG00000171564 | FGB | NA |
| NA | NA | ENSG00000249790 | ENSG00000249790 | RP11-20D14.6 | NA |
| carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | 1357 | ENSG00000091704 | CPA1 | NA |
| NA | NA | ENSG00000272030 | ENSG00000272030 | RP1-178F15.4 | NA |
| colipase | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | 1208 | ENSG00000137392 | CLPS | NA |
| syncollin | NA | 342898 | ENSG00000179751 | SYCN | NA |
| apolipoprotein C3 | Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-1 and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | 345 | ENSG00000110245 | APOC3 | NA |
| spondin 2 | NA | 10417 | ENSG00000159674 | SPON2 | NA |
| zymogen granule protein 16B | NA | 124220 | ENSG00000162078 | ZG16B | NA |
| mucin 7, secreted | This gene encodes a small salivary mucin, which is thought to play a role in facilitating the clearance of bacteria in the oral cavity and to aid in mastication, speech, and swallowing. The central domain of this glycoprotein contains tandem repeats, each composed of 23 amino acids. This antimicrobial protein has antibacterial and antifungal activity. The most common allele contains 6 repeats, and some alleles may be associated with susceptibility to asthma. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. | 4589 | ENSG00000171195 | MUC7 | NA |
| fibronectin 1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | ENSG00000115414 | FN1 | NA |
| NA | NA | NA | ENSG00000184674 | NA | TRUE |
| hedgehog acyltransferase-like | NA | 57467 | ENSG00000010282 | HHATL | NA |
| chymotrypsinogen B1 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | 1504 | ENSG00000168925 | CTRB1 | NA |
| charged multivesicular body protein 4C | CHMP4C belongs to the chromatin-modifying protein/charged multivesicular body protein (CHMP) family. These proteins are components of ESCRT-III (endosomal sorting complex required for transport III), a complex involved in degradation of surface receptor proteins and formation of endocytic multivesicular bodies (MVBs). Some CHMPs have both nuclear and cytoplasmic/vesicular distributions, and one such CHMP, CHMP1A (MIM 164010), is required for both MVB formation and regulation of cell cycle progression (Tsang et al., 2006 [PubMed 16730941]). | 92421 | ENSG00000164695 | CHMP4C | NA |
| neurogranin | Neurogranin (NRGN) is the human homolog of the neuron-specific rat RC3/neurogranin gene. This gene encodes a postsynaptic protein kinase substrate that binds calmodulin in the absence of calcium. The NRGN gene contains four exons and three introns. The exons 1 and 2 encode the protein and exons 3 and 4 contain untranslated sequences. It is suggested that the NRGN is a direct target for thyroid hormone in human brain, and that control of expression of this gene could underlay many of the consequences of hypothyroidism on mental states during development as well as in adult subjects. | 4900 | ENSG00000154146 | NRGN | NA |
| NA | NA | NA | ENSG00000250606 | NA | TRUE |
| pancreatic lipase related protein 1 | NA | 5407 | ENSG00000187021 | PNLIPRP1 | NA |
| secreted frizzled related protein 4 | Secreted frizzled-related protein 4 (SFRP4) is a member of the SFRP family that contains a cysteine-rich domain homologous to the putative Wnt-binding site of Frizzled proteins. SFRPs act as soluble modulators of Wnt signaling. The expression of SFRP4 in ventricular myocardium correlates with apoptosis related gene expression. | 6424 | ENSG00000106483 | SFRP4 | NA |
| nexilin F-actin binding protein | This gene encodes a filamentous actin-binding protein that may function in cell adhesion and migration. Mutations in this gene have been associated with dilated cardiomyopathy, also known as CMD1CC. Alternatively spliced transcript variants have been described. | 91624 | ENSG00000162614 | NEXN | NA |
| chymotrypsinogen B2 | NA | 440387 | ENSG00000168928 | CTRB2 | NA |
| chymotrypsin like elastase family member 2A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2A is secreted from the pancreas as a zymogen. In other species, elastase 2A has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | 63036 | ENSG00000142615 | CELA2A | NA |
| nebulette | This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. | 10529 | ENSG00000078114 | NEBL | NA |
| myosin light chain 1 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in fast skeletal muscle. Two transcript variants have been identified for this gene. | 4632 | ENSG00000168530 | MYL1 | NA |
| collagen type V alpha 1 | This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. The encoded procollagen protein occurs commonly as the heterotrimer pro-alpha1(V)-pro-alpha1(V)-pro-alpha2(V). Mutations in this gene are associated with Ehlers-Danlos syndrome, types I and II. Alternative splicing of this gene results in multiple transcript variants. | 1289 | ENSG00000130635 | COL5A1 | NA |
| NA | NA | ENSG00000273179 | ENSG00000273179 | RP11-20I20.4 | NA |
| NA | NA | ENSG00000259279 | ENSG00000259279 | CTD-2033D15.1 | NA |
| filamin C | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | 2318 | ENSG00000128591 | FLNC | NA |
| NA | NA | ENSG00000268649 | ENSG00000268649 | RP4-806M20.4 | NA |
| myosin, heavy chain 1, skeletal muscle, adult | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | 4619 | ENSG00000109061 | MYH1 | NA |
| collagen type I alpha 1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1277 | ENSG00000108821 | COL1A1 | NA |
| TM4SF19 antisense RNA 1 | NA | 100874214 | ENSG00000235897 | TM4SF19-AS1 | NA |
| serine incorporator 2 | NA | 347735 | ENSG00000168528 | SERINC2 | NA |
| collagen type VI alpha 1 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | 1291 | ENSG00000142156 | COL6A1 | NA |
| sarcoglycan alpha | This gene encodes a component of the dystrophin-glycoprotein complex (DGC), which is critical to the stability of muscle fiber membranes and to the linking of the actin cytoskeleton to the extracellular matrix. Its expression is thought to be restricted to striated muscle. Mutations in this gene result in type 2D autosomal recessive limb-girdle muscular dystrophy. Multiple transcript variants encoding different isoforms have been found for this gene. | 6442 | ENSG00000108823 | SGCA | NA |
| NA | NA | ENSG00000212743 | ENSG00000212743 | RP11-563J2.3 | NA |
| pepsinogen 3, group I (pepsinogen A) | This gene encodes a protein precursor of the digestive enzyme pepsin, a member of the peptidase A1 family of endopeptidases. The encoded precursor is secreted by gastric chief cells and undergoes autocatalytic cleavage in acidic conditions to form the active enzyme, which functions in the digestion of dietary proteins. This gene is found in a cluster of related genes on chromosome 11, each of which encodes one of multiple pepsinogens. Pepsinogen levels in serum may serve as a biomarker for atrophic gastritis and gastric cancer. | 643834 | ENSG00000229859 | PGA3 | NA |
| high mobility group box 1 pseudogene 3 | NA | ENSG00000250011 | ENSG00000250011 | HMGB1P3 | NA |
| tetraspanin 1 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | 10103 | ENSG00000117472 | TSPAN1 | NA |
| fibrinogen alpha chain | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | 2243 | ENSG00000171560 | FGA | NA |
| orosomucoid 1 | This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | 5004 | ENSG00000229314 | ORM1 | NA |
| NA | NA | NA | ENSG00000272403 | NA | TRUE |
| NA | NA | NA | ENSG00000197262 | NA | TRUE |
| sphingomyelin phosphodiesterase acid like 3A | NA | 10924 | ENSG00000172594 | SMPDL3A | NA |
| C-C motif chemokine ligand 21 | This antimicrobial gene is one of several CC cytokine genes clustered on the p-arm of chromosome 9. Cytokines are a family of secreted proteins involved in immunoregulatory and inflammatory processes. The CC cytokines are proteins characterized by two adjacent cysteines. Similar to other chemokines the protein encoded by this gene inhibits hemopoiesis and stimulates chemotaxis. This protein is chemotactic in vitro for thymocytes and activated T cells, but not for B cells, macrophages, or neutrophils. The cytokine encoded by this gene may also play a role in mediating homing of lymphocytes to secondary lymphoid organs. It is a high affinity functional ligand for chemokine receptor 7 that is expressed on T and B lymphocytes and a known receptor for another member of the cytokine family (small inducible cytokine A19). | 6366 | ENSG00000137077 | CCL21 | NA |
| myosin, heavy chain 7B, cardiac muscle, beta | The myosin II molecule is a multi-subunit complex consisting of two heavy chains and four light chains. This gene encodes a heavy chain of myosin II, which is a member of the motor-domain superfamily. The heavy chain includes a globular motor domain, which catalyzes ATP hydrolysis and interacts with actin, and a tail domain in which heptad repeat sequences promote dimerization by interacting to form a rod-like alpha-helical coiled coil. This heavy chain subunit is a slow-twitch myosin. Alternatively spliced transcript variants have been found, but the full-length nature of these variants is not determined. | 57644 | ENSG00000078814 | MYH7B | NA |
| collagen type VI alpha 2 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | 1292 | ENSG00000142173 | COL6A2 | NA |
| prostaglandin-endoperoxide synthase 1 | This is one of two genes encoding similar enzymes that catalyze the conversion of arachidonate to prostaglandin. The encoded protein regulates angiogenesis in endothelial cells, and is inhibited by nonsteroidal anti-inflammatory drugs such as aspirin. Based on its ability to function as both a cyclooxygenase and as a peroxidase, the encoded protein has been identified as a moonlighting protein. The protein may promote cell proliferation during tumor progression. Alternative splicing results in multiple transcript variants. | 5742 | ENSG00000095303 | PTGS1 | NA |
| heparan sulfate-glucosamine 3-sulfotransferase 3B1 | The protein encoded by this gene is a type II integral membrane protein that belongs to the 3-O-sulfotransferases family. These proteins catalyze the addition of sulfate groups at the 3-OH position of glucosamine in heparan sulfate. The substrate specificity of individual members of the family is based on prior modification of the heparan sulfate chain, thus allowing different members of the family to generate binding sites for different proteins on the same heparan sulfate chain. Following treatment with a histone deacetylase inhibitor, expression of this gene is activated in a pancreatic cell line. The increased expression results in promotion of the epithelial-mesenchymal transition. In addition, the modification catalyzed by this protein allows herpes simplex virus membrane fusion and penetration. A very closely related homolog with an almost identical sulfotransferase domain maps less than 1 Mb away. Alternative splicing results in multiple transcript variants. | 9953 | ENSG00000125430 | HS3ST3B1 | NA |
| spectrin beta, non-erythrocytic 2 | Spectrins are principal components of a cell’s membrane-cytoskeleton and are composed of two alpha and two beta spectrin subunits. The protein encoded by this gene (SPTBN2) is called spectrin beta non-erythrocytic 2 or beta-III spectrin. It is related to, but distinct from, the beta-II spectrin gene which is also known as spectrin beta non-erythrocytic 1 (SPTBN1). SPTBN2 regulates the glutamate signaling pathway by stabilizing the glutamate transporter EAAT4 at the surface of the plasma membrane. Mutations in this gene cause a form of spinocerebellar ataxia, SCA5, that is characterized by neurodegeneration, progressive locomotor incoordination, dysarthria, and uncoordinated eye movements. | 6712 | ENSG00000173898 | SPTBN2 | NA |
| copine 5 | Calcium-dependent membrane-binding proteins may regulate molecular events at the interface of the cell membrane and cytoplasm. This gene is one of several genes that encode a calcium-dependent protein containing two N-terminal type II C2 domains and an integrin A domain-like sequence in the C-terminus. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. More variants may exist, but their full-length natures could not be determined. | 57699 | ENSG00000124772 | CPNE5 | NA |
| lysyl oxidase like 2 | This gene encodes a member of the lysyl oxidase gene family. The prototypic member of the family is essential to the biogenesis of connective tissue, encoding an extracellular copper-dependent amine oxidase that catalyses the first step in the formation of crosslinks in collagens and elastin. A highly conserved amino acid sequence at the C-terminus end appears to be sufficient for amine oxidase activity, suggesting that each family member may retain this function. The N-terminus is poorly conserved and may impart additional roles in developmental regulation, senescence, tumor suppression, cell growth control, and chemotaxis to each member of the family. | 4017 | ENSG00000134013 | LOXL2 | NA |
| smoothelin like 1 | SMTNL1, which is a member of the smoothelin (SMTN; MIM 602127) family, regulates contraction and relaxation of skeletal and smooth muscle fibers and mediates vascular adaptation to exercise (Wooldridge et al., 2008 [PubMed 18310078]). | 219537 | ENSG00000214872 | SMTNL1 | NA |
| thrombospondin 1 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | 7057 | ENSG00000137801 | THBS1 | NA |
| NA | NA | ENSG00000258376 | ENSG00000258376 | RP4-647C14.2 | NA |
| natriuretic peptide receptor 1 | Guanylyl cyclases, catalyzing the production of cGMP from GTP, are classified as soluble and membrane forms (Garbers and Lowe, 1994 [PubMed 7982997]). The membrane guanylyl cyclases, often termed guanylyl cyclases A through F, form a family of cell-surface receptors with a similar topographic structure: an extracellular ligand-binding domain, a single membrane-spanning domain, and an intracellular region that contains a protein kinase-like domain and a cyclase catalytic domain. GC-A and GC-B function as receptors for natriuretic peptides; they are also referred to as atrial natriuretic peptide receptor A (NPR1) and type B (NPR2; MIM 108961). Also see NPR3 (MIM 108962), which encodes a protein with only the ligand-binding transmembrane and 37-amino acid cytoplasmic domains. NPR1 is a membrane-bound guanylate cyclase that serves as the receptor for both atrial and brain natriuretic peptides (ANP (MIM 108780) and BNP (MIM 600295), respectively). | 4881 | ENSG00000169418 | NPR1 | NA |
| collagen type IX alpha 3 | This gene encodes one of the three alpha chains of type IX collagen, the major collagen component of hyaline cartilage. Type IX collagen, a heterotrimeric molecule, is usually found in tissues containing type II collagen, a fibrillar collagen. Mutations in this gene are associated with multiple epiphyseal dysplasia type 3. | 1299 | ENSG00000092758 | COL9A3 | NA |
| NA | NA | ENSG00000254680 | ENSG00000254680 | RP11-265D17.2 | NA |
| scavenger receptor cysteine rich family member with 5 domains | NA | 284297 | ENSG00000179954 | SSC5D | NA |
| carboxypeptidase B1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | 1360 | ENSG00000153002 | CPB1 | NA |
| matrix metallopeptidase 2 | This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or intracellularly by its S-glutathiolation with no requirement for proteolytical removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | 4313 | ENSG00000087245 | MMP2 | NA |
| solute carrier family 7 member 11 | This gene encodes a member of a heteromeric, sodium-independent, anionic amino acid transport system that is highly specific for cysteine and glutamate. In this system, designated Xc(-), the anionic form of cysteine is transported in exchange for glutamate. This protein has been identified as the predominant mediator of Kaposi sarcoma-associated herpesvirus fusion and entry permissiveness into cells. Also, increased expression of this gene in primary gliomas (compared to normal brain tissue) was associated with increased glutamate secretion via the XCT channels, resulting in neuronal cell death. | 23657 | ENSG00000151012 | SLC7A11 | NA |
| NA | NA | ENSG00000264272 | ENSG00000264272 | CTD-2514K5.4 | NA |
| radial spoke head 1 homolog | This gene encodes a male meiotic metaphase chromosome-associated acidic protein. This gene is expressed in tissues with motile cilia or flagella, including the trachea, lungs, airway brushings, and testes. Mutations in this gene result in primary ciliary dyskinesia-24. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 89765 | ENSG00000160188 | RSPH1 | NA |
| surfactant protein A2 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | 729238 | ENSG00000185303 | SFTPA2 | NA |
| cerebellin 3 precursor | Members of the precerebellin family, such as CBLN3, contain a cerebellin motif (see CBLN1; MIM 600432) and a C-terminal C1q signature domain (see MIM 120550) that mediates trimeric assembly of atypical collagen complexes. However, precerebellins do not contain a collagen motif, suggesting that they are not conventional components of the extracellular matrix (Pang et al., 2000 [PubMed 10964938]). | 643866 | ENSG00000139899 | CBLN3 | NA |
| ribosomal protein L36 pseudogene 4 | NA | ENSG00000224497 | ENSG00000224497 | RPL36P4 | NA |
| ubiquitin associated and SH3 domain containing B | This gene encodes a protein that contains a ubiquitin associated domain at the N-terminus, an SH3 domain, and a C-terminal domain with similarities to the catalytic motif of phosphoglycerate mutase. The encoded protein was found to inhibit endocytosis of epidermal growth factor receptor (EGFR) and platelet-derived growth factor receptor. | 84959 | ENSG00000154127 | UBASH3B | NA |
| NA | NA | ENSG00000272512 | ENSG00000272512 | RP11-54O7.17 | NA |
| hypoxia inducible lipid droplet associated | NA | 29923 | ENSG00000135245 | HILPDA | NA |
| galectin 4 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | 3960 | ENSG00000171747 | LGALS4 | NA |
| chromosome 8 open reading frame 88 | NA | 100127983 | ENSG00000253250 | C8orf88 | NA |
| polypeptide N-acetylgalactosaminyltransferase 12 | This gene encodes a member of a family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, which catalyze the transfer of N-acetylgalactosamine (GalNAc) from UDP-GalNAc to a serine or threonine residue on a polypeptide acceptor in the initial step of O-linked protein glycosylation. Mutations in this gene are associated with an increased susceptibility to colorectal cancer. | 79695 | ENSG00000119514 | GALNT12 | NA |
| lymphocyte activating 3 | Lymphocyte-activation protein 3 belongs to Ig superfamily and contains 4 extracellular Ig-like domains. The LAG3 gene contains 8 exons. The sequence data, exon/intron organization, and chromosomal localization all indicate a close relationship of LAG3 to CD4. | 3902 | ENSG00000089692 | LAG3 | NA |
| collagen type I alpha 2 chain | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1278 | ENSG00000164692 | COL1A2 | NA |
| integrin subunit alpha 5 | The product of this gene belongs to the integrin alpha chain family. Integrins are heterodimeric integral membrane proteins composed of an alpha subunit and a beta subunit that function in cell surface adhesion and signaling. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 5 subunit. This subunit associates with the beta 1 subunit to form a fibronectin receptor. This integrin may promote tumor invasion, and higher expression of this gene may be correlated with shorter survival time in lung cancer patients. Note that the integrin alpha 5 and integrin alpha V subunits are encoded by distinct genes. | 3678 | ENSG00000161638 | ITGA5 | NA |
| tissue differentiation-inducing non-protein coding RNA | This gene produces a spliced long non-coding RNA that is required for normal epidermal differentiation. This transcript regulates the expression of genes involved in the differentiation of epidermal tissue. Mutations in some of the genes targeted by this transcript have been implicated in epidermal skin diseases. | 257000 | ENSG00000223573 | TINCR | NA |
| lipase F, gastric type | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | 8513 | ENSG00000182333 | LIPF | NA |
| apolipoprotein B mRNA editing enzyme catalytic subunit 3F | This gene is a member of the cytidine deaminase gene family. It is one of seven related genes or pseudogenes found in a cluster, thought to result from gene duplication, on chromosome 22. Members of the cluster encode proteins that are structurally and functionally related to the C to U RNA-editing cytidine deaminase APOBEC1. It is thought that the proteins may be RNA editing enzymes and have roles in growth or cell cycle control. Alternatively spliced transcript variants encoding different isoforms have been identified. | 200316 | ENSG00000128394 | APOBEC3F | NA |
| AHNAK nucleoprotein 2 | NA | 113146 | ENSG00000185567 | AHNAK2 | NA |
| hydroxysteroid 11-beta dehydrogenase 1 | The protein encoded by this gene is a microsomal enzyme that catalyzes the conversion of the stress hormone cortisol to the inactive metabolite cortisone. In addition, the encoded protein can catalyze the reverse reaction, the conversion of cortisone to cortisol. Too much cortisol can lead to central obesity, and a particular variation in this gene has been associated with obesity and insulin resistance in children. Mutations in this gene and H6PD (hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase)) are the cause of cortisone reductase deficiency. Alternate splicing results in multiple transcript variants encoding the same protein. | 3290 | ENSG00000117594 | HSD11B1 | NA |
| PCOLCE antisense RNA 1 | NA | 100129845 | ENSG00000224729 | PCOLCE-AS1 | NA |
| metallothionein 3 | NA | 4504 | ENSG00000087250 | MT3 | NA |
| hemopexin | This gene encodes a plasma glycoprotein that binds heme with high affinity. The encoded protein is an acute phase protein that transports heme from the plasma to the liver and may be involved in protecting cells from oxidative stress. | 3263 | ENSG00000110169 | HPX | NA |
| egl-9 family hypoxia inducible factor 3 | NA | 112399 | ENSG00000129521 | EGLN3 | NA |
| elastin microfibril interfacer 1 | This gene encodes an extracellular matrix glycoprotein that is characterized by an N-terminal microfibril interface domain, a coiled-coil alpha-helical domain, a collagenous domain and a C-terminal globular C1q domain. The encoded protein associates with elastic fibers at the interface between elastin and microfibrils and may play a role in the development of elastic tissues including large blood vessels, dermis, heart and lung. | 11117 | ENSG00000138080 | EMILIN1 | NA |
| DLGAP1 antisense RNA 1 | NA | ENSG00000177337 | ENSG00000177337 | DLGAP1-AS1 | NA |
| prolactin induced protein | NA | 5304 | ENSG00000159763 | PIP | NA |
| glutathione S-transferase mu 1 | Cytosolic and membrane-bound forms of glutathione S-transferase are encoded by two distinct supergene families. At present, eight distinct classes of the soluble cytoplasmic mammalian glutathione S-transferases have been identified: alpha, kappa, mu, omega, pi, sigma, theta and zeta. This gene encodes a glutathione S-transferase that belongs to the mu class. The mu class of enzymes functions in the detoxification of electrophilic compounds, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress, by conjugation with glutathione. The genes encoding the mu class of enzymes are organized in a gene cluster on chromosome 1p13.3 and are known to be highly polymorphic. These genetic variations can change an individual’s susceptibility to carcinogens and toxins as well as affect the toxicity and efficacy of certain drugs. Null mutations of this class mu gene have been linked with an increase in a number of cancers, likely due to an increased susceptibility to environmental toxins and carcinogens. Multiple protein isoforms are encoded by transcript variants of this gene. | 2944 | ENSG00000134184 | GSTM1 | NA |
| neurexophilin 3 | NA | 11248 | ENSG00000182575 | NXPH3 | NA |
| NA | NA | ENSG00000257433 | ENSG00000257433 | RP1-197B17.3 | NA |
# Save the Ensembl IDs queried for cluster 11, one per line.
write.table(out$query, paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",11,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
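
The query, render, and write steps are repeated by hand for each cluster, and the column order of the rendered tables varies from one query to the next (the `notfound` column, for instance, appears only when some IDs have no match). The following is a minimal sketch of two helpers that would make each cluster's output uniform. The names `annotation_cols`, `tidy_annotations`, `write_cluster_genes`, and the `drop_notfound` argument are hypothetical and not part of the original analysis; the chosen column order is an assumption.

```r
# Hypothetical helpers (not part of the original analysis).
# Assumed display order for the annotation columns.
annotation_cols <- c("query", "symbol", "name", "summary", "X_id", "notfound")

tidy_annotations <- function(out) {
  df <- as.data.frame(out)
  # Tolerate either spelling of the ID column.
  names(df)[names(df) == "_id"] <- "X_id"
  # Add any column the query did not return (e.g. `notfound`) as NA.
  for (col in setdiff(annotation_cols, names(df))) df[[col]] <- NA
  df[, annotation_cols, drop = FALSE]
}

write_cluster_genes <- function(df, cluster, dir, drop_notfound = FALSE) {
  ids <- as.character(df$query)
  # Optionally skip IDs flagged as unmatched; the default keeps every ID,
  # which mirrors the write.table() call used in this document.
  if (drop_notfound) ids <- ids[is.na(df$notfound)]
  path <- file.path(dir, paste0("gene_names_clus_", cluster, ".txt"))
  writeLines(ids, path)
  invisible(path)
}

# Possible usage for one cluster (requires network access for the query):
# out <- mygene::queryMany(gene_list[12, ], scopes = "ensembl.gene",
#                          fields = c("name", "summary", "symbol"), species = "human")
# df <- tidy_annotations(out)
# kable(df)
# write_cluster_genes(df, 12, "../utilities/GTEX2013_sparse_load_voom")
```

With helpers like these, the per-cluster block could be wrapped in a loop over `seq_len(nrow(gene_list))`, provided the chunk is set up so that `kable` output is printed from inside the loop.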
out <- mygene::queryMany(gene_list[12,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | summary | name | symbol | query | notfound |
|---|---|---|---|---|---|
| 1488 | This gene produces alternative transcripts encoding two distinct proteins. One protein is a transcriptional repressor, while the other isoform is a major component of specialized synapses known as synaptic ribbons. Both proteins contain a NAD+ binding domain similar to NAD+-dependent 2-hydroxyacid dehydrogenases. A portion of the 3’ untranslated region was used to map this gene to chromosome 21q21.3; however, it was noted that similar loci elsewhere in the genome are likely. Blast analysis shows that this gene is present on chromosome 10. Several transcript variants encoding two different isoforms have been found for this gene. | C-terminal binding protein 2 | CTBP2 | ENSG00000175029 | NA |
| 5319 | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | phospholipase A2 group IB | PLA2G1B | ENSG00000170890 | NA |
| 8660 | This gene encodes the insulin receptor substrate 2, a cytoplasmic signaling molecule that mediates effects of insulin, insulin-like growth factor 1, and other cytokines by acting as a molecular adaptor between diverse receptor tyrosine kinases and downstream effectors. The product of this gene is phosphorylated by the insulin receptor tyrosine kinase upon receptor stimulation, as well as by an interleukin 4 receptor-associated kinase in response to IL4 treatment. | insulin receptor substrate 2 | IRS2 | ENSG00000185950 | NA |
| 100129550 | NA | uncharacterized LOC100129550 | LOC100129550 | ENSG00000273033 | NA |
| 51635 | This gene encodes a member of the short-chain dehydrogenases/reductases (SDR) family, which has over 46,000 members. Members in this family are enzymes that metabolize many different compounds, such as steroid hormones, prostaglandins, retinoids, lipids and xenobiotics. | dehydrogenase/reductase 7 | DHRS7 | ENSG00000100612 | NA |
| 51621 | KLF13 belongs to a family of transcription factors that contain 3 classical zinc finger DNA-binding domains consisting of a zinc atom tetrahedrally coordinated by 2 cysteines and 2 histidines (C2H2 motif). These transcription factors bind to GC-rich sequences and related GT and CACCC boxes (Scohy et al., 2000 [PubMed 11087666]). | Kruppel like factor 13 | KLF13 | ENSG00000169926 | NA |
| 64759 | NA | tensin 3 | TNS3 | ENSG00000136205 | NA |
| 11060 | This gene encodes a member of the Nedd4 family of E3 ligases, which play an important role in protein ubiquitination. The encoded protein contains four WW domains and may play a role in multiple processes including chondrogenesis and the regulation of oncogenic signaling pathways via interactions with Smad proteins and the tumor suppressor PTEN. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and a pseudogene of this gene is located on the long arm of chromosome 10. | WW domain containing E3 ubiquitin protein ligase 2 | WWP2 | ENSG00000198373 | NA |
| 80762 | The protein encoded by this gene belongs to a small group of evolutionarily conserved proteins with three transmembrane domains. It is a potential target for ubiquitination by the Nedd4 family of proteins. This protein is thought to be part of a family of integral Golgi membrane proteins. | Nedd4 family interacting protein 1 | NDFIP1 | ENSG00000131507 | NA |
| 6304 | This gene encodes a matrix protein which binds nuclear matrix and scaffold-associating DNAs through a unique nuclear architecture. The protein recruits chromatin-remodeling factors in order to regulate chromatin structure and gene expression. | SATB homeobox 1 | SATB1 | ENSG00000182568 | NA |
| 6809 | The gene is a member of the syntaxin family. The encoded protein is targeted to the apical membrane of epithelial cells where it forms clusters and is important in establishing and maintaining polarity necessary for protein trafficking involving vesicle fusion and exocytosis. Alternative splicing results in multiple transcript variants. | syntaxin 3 | STX3 | ENSG00000166900 | NA |
| 57515 | NA | serine incorporator 1 | SERINC1 | ENSG00000111897 | NA |
| ENSG00000235027 | NA | NA | AC068580.6 | ENSG00000235027 | NA |
| 54893 | NA | myotubularin related protein 10 | MTMR10 | ENSG00000166912 | NA |
| 10618 | This gene encodes a type I integral membrane protein that is localized to the trans-Golgi network, a major sorting station for secretory and membrane proteins. The encoded protein cycles between early endosomes and the trans-Golgi network, and may play a role in exocytic vesicle formation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | trans-golgi network protein 2 | TGOLN2 | ENSG00000152291 | NA |
| 65010 | This gene belongs to the solute carrier 26 family, whose members encode anion transporter proteins. This particular family member encodes a protein involved in transporting chloride, oxalate, sulfate and bicarbonate. Alternatively spliced transcript variants encoding distinct isoforms have been described. | solute carrier family 26 member 6 | SLC26A6 | ENSG00000225697 | NA |
| ENSG00000225313 | NA | NA | RP11-415J8.3 | ENSG00000225313 | NA |
| 2776 | This locus encodes a guanine nucleotide-binding protein. The encoded protein, an alpha subunit in the Gq class, couples a seven-transmembrane domain receptor to activation of phospholipase C-beta. Mutations at this locus have been associated with problems in platelet activation and aggregation. A related pseudogene exists on chromosome 2. | G protein subunit alpha q | GNAQ | ENSG00000156052 | NA |
| 334 | This gene encodes amyloid precursor-like protein 2 (APLP2), which is a member of the APP (amyloid precursor protein) family including APP, APLP1 and APLP2. This protein is ubiquitously expressed. It contains heparin-, copper- and zinc-binding domains at the N-terminus, BPTI/Kunitz inhibitor and E2 domains in the middle region, and transmembrane and intracellular domains at the C-terminus. This protein interacts with major histocompatibility complex (MHC) class I molecules. The synergy of this protein and the APP is required to mediate neuromuscular transmission, spatial learning and synaptic plasticity. This protein has been implicated in the pathogenesis of Alzheimer’s disease. Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | amyloid beta precursor like protein 2 | APLP2 | ENSG00000084234 | NA |
| 80344 | This gene encodes a WD repeat-containing protein that interacts with the COP9 signalosome, a macromolecular complex that interacts with cullin-RING E3 ligases and regulates their activity by hydrolyzing cullin-Nedd8 conjugates. Multiple alternatively spliced transcript variants have been found for this gene. | DDB1 and CUL4 associated factor 11 | DCAF11 | ENSG00000100897 | NA |
| 2289 | The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. This encoded protein is a cis-trans prolyl isomerase that binds to the immunosuppressants FK506 and rapamycin. It is thought to mediate calcineurin inhibition. It also interacts functionally with mature hetero-oligomeric progesterone receptor complexes along with the 90 kDa heat shock protein and P23 protein. This gene has been found to have multiple polyadenylation sites. Alternative splicing results in multiple transcript variants. | FK506 binding protein 5 | FKBP5 | ENSG00000096060 | NA |
| 64855 | NA | family with sequence similarity 129 member B | FAM129B | ENSG00000136830 | NA |
| ENSG00000271643 | NA | NA | RP11-10C24.3 | ENSG00000271643 | NA |
| 100505635 | NA | uncharacterized LOC100505635 | LOC100505635 | ENSG00000235033 | NA |
| 3556 | Interleukin 1 induces synthesis of acute phase and proinflammatory proteins during infection, tissue damage, or stress, by forming a complex at the cell membrane with an interleukin 1 receptor and an accessory protein. This gene encodes the interleukin 1 receptor accessory protein. The protein is a necessary part of the interleukin 1 receptor complex which initiates signalling events that result in the activation of interleukin 1-responsive genes. Alternative splicing of this gene results in two transcript variants encoding two different isoforms, one membrane-bound and one soluble. The ratio of soluble to membrane-bound forms increases during acute-phase induction or stress. | interleukin 1 receptor accessory protein | IL1RAP | ENSG00000196083 | NA |
| NA | NA | NA | NA | ENSG00000255813 | TRUE |
| ENSG00000271862 | NA | NA | RP11-343L5.2 | ENSG00000271862 | NA |
| 150967 | NA | DKFZp434H1419 | PKI55 | ENSG00000260804 | NA |
| 9788 | NA | metastasis suppressor 1 | MTSS1 | ENSG00000170873 | NA |
| ENSG00000257715 | NA | NA | RP11-256L6.2 | ENSG00000257715 | NA |
| 1509 | This gene encodes a member of the A1 family of peptidases. The encoded preproprotein is proteolytically processed to generate multiple protein products. These products include the cathepsin D light and heavy chains, which heterodimerize to form the mature enzyme. This enzyme exhibits pepsin-like activity and plays a role in protein turnover and in the proteolytic activation of hormones and growth factors. Mutations in this gene play a causal role in neuronal ceroid lipofuscinosis-10 and may be involved in the pathogenesis of several other diseases, including breast cancer and possibly Alzheimer’s disease. | cathepsin D | CTSD | ENSG00000117984 | NA |
| ENSG00000242960 | NA | ferritin, heavy polypeptide 1 pseudogene 23 | FTH1P23 | ENSG00000242960 | NA |
| 84925 | This gene encodes a membrane-bound protein from the major facilitator superfamily of transporters. Disruption of this gene by translocation has been associated with haplo-insufficiency and renal cell carcinomas. Alternatively spliced transcript variants have been described, but their biological validity has not yet been determined. | disrupted in renal carcinoma 2 | DIRC2 | ENSG00000138463 | NA |
| ENSG00000255670 | NA | NA | RP11-253I19.3 | ENSG00000255670 | NA |
| 54918 | This gene belongs to the chemokine-like factor gene superfamily, a novel family that is similar to the chemokine and transmembrane 4 superfamilies. This gene is one of several chemokine-like factor genes located in a cluster on chromosome 3. This gene is widely expressed in many tissues, but the exact function of the encoded protein is unknown. | CKLF like MARVEL transmembrane domain containing 6 | CMTM6 | ENSG00000091317 | NA |
| 5339 | Plectin is a prominent member of an important family of structurally and in part functionally related proteins, termed plakins or cytolinkers, that are capable of interlinking different elements of the cytoskeleton. Plakins, with their multi-domain structure and enormous size, not only play crucial roles in maintaining cell and tissue integrity and orchestrating dynamic changes in cytoarchitecture and cell shape, but also serve as scaffolding platforms for the assembly, positioning, and regulation of signaling complexes (reviewed in PMID: 9701547, 11854008, and 17499243). Plectin is expressed as several protein isoforms in a wide range of cell types and tissues from a single gene located on chromosome 8 in humans (PMID: 8633055, 8698233). Until 2010, this locus was named plectin 1 (symbol PLEC1 in human; Plec1 in mouse and rat) and the gene product had been referred to as ‘hemidesmosomal protein 1’ or ‘plectin 1, intermediate filament binding 500kDa’. These names were superseded by plectin. The plectin gene locus in mouse on chromosome 15 has been analyzed in detail (PMID: 10556294, 14559777), revealing a genomic exon-intron organization with well over 40 exons spanning over 62 kb and an unusual 5’ transcript complexity of plectin isoforms. Eleven exons (1-1j) have been identified that alternatively splice directly into a common exon 2 which is the first exon to encode plectin’s highly conserved actin binding domain (ABD). Three additional exons (-1, 0a, and 0) splice into an alternative first coding exon (1c), and two additional exons (2alpha and 3alpha) are optionally spliced within the exons encoding the actin binding domain (exons 2-8). Analysis of the human locus has identified eight of the eleven alternative 5’ exons found in mouse and rat (PMID: 14672974); exons 1i, 1j and 1h have not been confirmed in human. Furthermore, isoforms lacking the central rod domain encoded by exon 31 have been detected in mouse (PMID:10556294), rat (PMID: 9177781), and human (PMID: 11441066, 10780662, 20052759). The short alternative amino-terminal sequences encoded by the different first exons direct the targeting of the various isoforms to distinct subcellular locations (PMID: 14559777). As the expression of specific plectin isoforms was found to be dependent on cell type (tissue) and stage of development (PMID: 10556294, 12542521, 17389230) it appears that each cell type (tissue) contains a unique set (proportion and composition) of plectin isoforms, as if custom-made for specific requirements of the particular cells. Concordantly, individual isoforms were found to carry out distinct and specific functions (PMID: 14559777, 12542521, 18541706). In 1996, a number of groups reported that patients suffering from epidermolysis bullosa simplex with muscular dystrophy (EBS-MD) lacked plectin expression in skin and muscle tissues due to defects in the plectin gene (PMID: 8698233, 8941634, 8636409, 8894687, 8696340). Two other subtypes of plectin-related EBS have been described: EBS-pyloric atresia (PA) and EBS-Ogna. For reviews of plectin-related diseases see PMID: 15810881, 19945614. Mutations in the plectin gene related to human diseases should be named based on the position in NM_000445 (variant 1, isoform 1c), unless the mutation is located within one of the other alternative first exons, in which case the position in the respective Reference Sequence should be used. | plectin | PLEC | ENSG00000178209 | NA |
| 3732 | This metastasis suppressor gene product is a membrane glycoprotein that is a member of the transmembrane 4 superfamily. Expression of this gene has been shown to be downregulated in tumor progression of human cancers and can be activated by p53 through a consensus binding sequence in the promoter. Its expression and that of p53 are strongly correlated, and the loss of expression of these two proteins is associated with poor survival for prostate cancer patients. Two alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | CD82 molecule | CD82 | ENSG00000085117 | NA |
| 90 | Activins are dimeric growth and differentiation factors which belong to the transforming growth factor-beta (TGF-beta) superfamily of structurally related signaling proteins. Activins signal through a heteromeric complex of receptor serine kinases which include at least two type I (I and IB) and two type II (II and IIB) receptors. These receptors are all transmembrane proteins, composed of a ligand-binding extracellular domain with cysteine-rich region, a transmembrane domain, and a cytoplasmic domain with predicted serine/threonine specificity. Type I receptors are essential for signaling; and type II receptors are required for binding ligands and for expression of type I receptors. Type I and II receptors form a stable complex after ligand binding, resulting in phosphorylation of type I receptors by type II receptors. This gene encodes activin A type I receptor which signals a particular transcriptional response in concert with activin type II receptors. Mutations in this gene are associated with fibrodysplasia ossificans progressiva. | activin A receptor type 1 | ACVR1 | ENSG00000115170 | NA |
| NA | NA | NA | NA | ENSG00000256845 | TRUE |
| 137886 | NA | UBX domain protein 2B | UBXN2B | ENSG00000215114 | NA |
| 206358 | This gene encodes a member of the eukaryote-specific amino acid/auxin permease (AAAP) 1 transporter family. The encoded protein functions as a proton-dependent, small amino acid transporter. This gene is clustered with related family members on chromosome 5q33.1. Alternative splicing results in multiple transcript variants. | solute carrier family 36 member 1 | SLC36A1 | ENSG00000123643 | NA |
| 57506 | This gene encodes an intermediary protein necessary in the virus-triggered beta interferon signaling pathways. It is required for activation of transcription factors which regulate expression of beta interferon and contributes to antiviral immunity. Multiple transcript variants encoding different isoforms have been found for this gene. | mitochondrial antiviral signaling protein | MAVS | ENSG00000088888 | NA |
| 9802 | This gene encodes a proline-rich protein which interacts with the deleted in azoospermia (DAZ) and the deleted in azoospermia-like gene through the DAZ-like repeats. This protein also interacts with the transforming growth factor-beta signaling molecule SARA (Smad anchor for receptor activation), eukaryotic initiation factor 4G, and an E3 ubiquitinase that regulates its stability in splicing factor containing nuclear speckles. The encoded protein may function in various biological and pathological processes including spermatogenesis, cell signaling and transcription regulation, formation of stress granules during translation arrest, RNA splicing, and pathogenesis of multiple myeloma. Multiple transcript variants encoding different isoforms have been found for this gene. | DAZ associated protein 2 | DAZAP2 | ENSG00000183283 | NA |
| ENSG00000256448 | NA | NA | RP11-809N8.4 | ENSG00000256448 | NA |
| ENSG00000255680 | NA | NA | RP11-732A19.9 | ENSG00000255680 | NA |
| 4942 | This gene encodes the mitochondrial enzyme ornithine aminotransferase, which is a key enzyme in the pathway that converts arginine and ornithine into the major excitatory and inhibitory neurotransmitters glutamate and GABA. Mutations that result in a deficiency of this enzyme cause the autosomal recessive eye disease Gyrate Atrophy. Alternatively spliced transcript variants encoding different isoforms have been described. Related pseudogenes have been defined on the X chromosome. | ornithine aminotransferase | OAT | ENSG00000065154 | NA |
| 9375 | This gene encodes a member of the transmembrane 9 superfamily. The encoded 76 kDa protein localizes to early endosomes in human cells. The encoded protein possesses a conserved and highly hydrophobic C-terminal domain which contains nine transmembrane domains. The protein may play a role in small molecule transport or act as an ion channel. A pseudogene associated with this gene is located on the X chromosome. | transmembrane 9 superfamily member 2 | TM9SF2 | ENSG00000125304 | NA |
| 9637 | This gene is an ortholog of the C. elegans unc-76 gene, which is necessary for normal axonal bundling and elongation within axon bundles. Other orthologs include the rat gene that encodes zygin II, which can bind to synaptotagmin. | fasciculation and elongation protein zeta 2 | FEZ2 | ENSG00000171055 | NA |
| ENSG00000233739 | NA | NA | RP5-1039K5.13 | ENSG00000233739 | NA |
| 63971 | This gene encodes a member of the kinesin family of microtubule-based motor proteins that function in the positioning of endosomes. This family member can direct mannose-6-phosphate receptor-containing vesicles from the trans-Golgi network to the plasma membrane, and it is necessary for the steady-state distribution of late endosomes/lysosomes. It is also required for the translocation of FYVE-CENT and TTC19 from the centrosome to the midbody during cytokinesis, and it plays a role in melanosome maturation. Alternative splicing of this gene results in multiple transcript variants. | kinesin family member 13A | KIF13A | ENSG00000137177 | NA |
| 10079 | NA | ATPase phospholipid transporting 9A (putative) | ATP9A | ENSG00000054793 | NA |
| 6655 | NA | SOS Ras/Rho guanine nucleotide exchange factor 2 | SOS2 | ENSG00000100485 | NA |
| 9778 | NA | KIAA0232 | KIAA0232 | ENSG00000170871 | NA |
| 388115 | NA | chromosome 15 open reading frame 52 | C15orf52 | ENSG00000188549 | NA |
| ENSG00000257831 | NA | NA | RP11-596D21.1 | ENSG00000257831 | NA |
| 54414 | This gene encodes an enzyme which removes 9-O-acetylation modifications from sialic acids. Mutations in this gene are associated with susceptibility to autoimmune disease 6. Multiple transcript variants encoding different isoforms, found either in the cytosol or in the lysosome, have been found for this gene. | sialic acid acetylesterase | SIAE | ENSG00000110013 | NA |
| 7009 | NA | transmembrane BAX inhibitor motif containing 6 | TMBIM6 | ENSG00000139644 | NA |
| 55755 | This gene encodes a regulator of CDK5 (cyclin-dependent kinase 5) activity. The protein encoded by this gene is localized to the centrosome and Golgi complex, interacts with CDK5R1 and pericentrin (PCNT), plays a role in centriole engagement and microtubule nucleation, and has been linked to primary microcephaly and Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | CDK5 regulatory subunit associated protein 2 | CDK5RAP2 | ENSG00000136861 | NA |
| 9580 | This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins. It has also been determined to be a type-1 diabetes autoantigen, also known as islet cell antibody 12. | SRY-box 13 | SOX13 | ENSG00000143842 | NA |
| ENSG00000254693 | NA | NA | RP11-58K22.5 | ENSG00000254693 | NA |
| 10677 | The protein encoded by this gene is a member of the gelsolin/villin family of actin regulatory proteins. This protein has structural similarity to villin. It binds actin and may play a role in the development of neuronal cells that form ganglia. | advillin | AVIL | ENSG00000135407 | NA |
| 9728 | NA | SECIS binding protein 2 like | SECISBP2L | ENSG00000138593 | NA |
| ENSG00000272659 | NA | NA | AP000295.10 | ENSG00000272659 | NA |
| 9236 | NA | cell cycle progression 1 | CCPG1 | ENSG00000260916 | NA |
| 58472 | The protein encoded by this gene may function in mitochondria to catalyze the conversion of sulfide to persulfides, thereby decreasing toxic concentrations of sulfide. Alternative splicing results in multiple transcript variants that encode the same protein. | sulfide quinone reductase-like (yeast) | SQRDL | ENSG00000137767 | NA |
| 9826 | Rho GTPases play a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors. The encoded protein may form a complex with G proteins and stimulate Rho-dependent signals. A similar protein in rat interacts with glutamate transporter EAAT4 and modulates its glutamate transport activity. Expression of the rat protein induces the reorganization of the actin cytoskeleton and its overexpression induces the formation of membrane ruffling and filopodia. Two alternative transcripts encoding different isoforms have been described. | Rho guanine nucleotide exchange factor 11 | ARHGEF11 | ENSG00000132694 | NA |
| 146223 | This gene belongs to the chemokine-like factor gene superfamily, a novel family that is similar to the chemokine and the transmembrane 4 superfamilies of signaling molecules. This gene is one of several chemokine-like factor genes located in a cluster on chromosome 16. Alternatively spliced transcript variants encoding different isoforms have been identified. | CKLF like MARVEL transmembrane domain containing 4 | CMTM4 | ENSG00000183723 | NA |
| NA | NA | NA | NA | ENSG00000272091 | TRUE |
| 9936 | CD302 is a C-type lectin receptor involved in cell adhesion and migration, as well as endocytosis and phagocytosis (Kato et al., 2007 [PubMed 17947679]). | CD302 molecule | CD302 | ENSG00000241399 | NA |
| 115548 | NA | FCH domain only 2 | FCHO2 | ENSG00000157107 | NA |
| 3092 | The product of this gene is a membrane-associated protein that functions in clathrin-mediated endocytosis and protein trafficking within the cell. The encoded protein binds to the huntingtin protein in the brain; this interaction is lost in Huntington’s disease. Alternative splicing results in multiple transcript variants. | huntingtin interacting protein 1 | HIP1 | ENSG00000127946 | NA |
| ENSG00000231025 | NA | NA | RP11-175O19.4 | ENSG00000231025 | NA |
| 57222 | This gene encodes a cycling membrane protein which is an endoplasmic reticulum-golgi intermediate compartment (ERGIC) protein which interacts with other members of this protein family to increase their turnover. | endoplasmic reticulum-golgi intermediate compartment 1 | ERGIC1 | ENSG00000113719 | NA |
| NA | NA | NA | NA | ENSG00000203305 | TRUE |
| ENSG00000259468 | NA | NA | RP11-1084A12.2 | ENSG00000259468 | NA |
| 8741 | The protein encoded by this gene is a member of the tumor necrosis factor (TNF) ligand family. This protein is a ligand for TNFRSF17/BCMA, a member of the TNF receptor family. This protein and its receptor are both found to be important for B cell development. In vitro experiments suggested that this protein may be able to induce apoptosis through its interaction with other TNF receptor family proteins such as TNFRSF6/FAS and TNFRSF14/HVEM. Alternative splicing results in multiple transcript variants. Some transcripts that skip the last exon of the upstream gene (TNFSF12) and continue into the second exon of this gene have been identified; such read-through transcripts are contained in GeneID 407977, TNFSF12-TNFSF13. | tumor necrosis factor superfamily member 13 | TNFSF13 | ENSG00000161955 | NA |
| ENSG00000234975 | NA | ferritin, heavy polypeptide 1 pseudogene 2 | FTH1P2 | ENSG00000234975 | NA |
| ENSG00000232187 | NA | ferritin, heavy polypeptide 1 pseudogene 7 | FTH1P7 | ENSG00000232187 | NA |
| 5660 | This gene encodes a highly conserved preproprotein that is proteolytically processed to generate four main cleavage products including saposins A, B, C, and D. Each domain of the precursor protein is approximately 80 amino acid residues long with nearly identical placement of cysteine residues and glycosylation sites. Saposins A-D localize primarily to the lysosomal compartment where they facilitate the catabolism of glycosphingolipids with short oligosaccharide groups. The precursor protein exists both as a secretory protein and as an integral membrane protein and has neurotrophic activities. Mutations in this gene have been associated with Gaucher disease and metachromatic leukodystrophy. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | prosaposin | PSAP | ENSG00000197746 | NA |
| ENSG00000227201 | NA | calponin 2 pseudogene 1 | CNN2P1 | ENSG00000227201 | NA |
| 23052 | NA | endonuclease domain containing 1 | ENDOD1 | ENSG00000149218 | NA |
| ENSG00000261329 | NA | NA | CTD-2049O4.1 | ENSG00000261329 | NA |
| 25778 | This gene encodes a dual serine/threonine and tyrosine protein kinase which is expressed in multiple tissues. It is thought to function as a regulator of cell death. Multiple transcript variants encoding different isoforms have been found for this gene. | dual serine/threonine and tyrosine protein kinase | DSTYK | ENSG00000133059 | NA |
| ENSG00000261064 | NA | NA | RP11-1000B6.3 | ENSG00000261064 | NA |
| 11010 | This gene encodes a protein with similarity to both the pathogenesis-related protein (PR) superfamily and the cysteine-rich secretory protein (CRISP) family. Increased expression of this gene is associated with myelomocytic differentiation in macrophage and decreased expression of this gene through gene methylation is associated with prostate cancer. The protein has proapoptotic activities in prostate and bladder cancer cells. This gene is a member of a cluster on chromosome 12 containing two other similar genes. Alternatively spliced variants which encode different protein isoforms have been described; however, not all variants have been fully characterized. | GLI pathogenesis related 1 | GLIPR1 | ENSG00000139278 | NA |
| 2495 | This gene encodes the heavy subunit of ferritin, the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in ferritin proteins are associated with several neurodegenerative diseases. This gene has multiple pseudogenes. Several alternatively spliced transcript variants have been observed, but their biological validity has not been determined. | ferritin heavy chain 1 | FTH1 | ENSG00000167996 | NA |
| 51567 | This gene encodes a member of a superfamily of divalent cation-dependent phosphodiesterases. The encoded protein associates with CD40, tumor necrosis factor (TNF) receptor-75 and TNF receptor associated factors (TRAFs), and inhibits nuclear factor-kappa-B activation. This protein has sequence and structural similarities with APE1 endonuclease, which is involved in both DNA repair and the activation of transcription factors. | tyrosyl-DNA phosphodiesterase 2 | TDP2 | ENSG00000111802 | NA |
| 51094 | This gene encodes a protein which acts as a receptor for adiponectin, a hormone secreted by adipocytes which regulates fatty acid catabolism and glucose levels. Binding of adiponectin to the encoded protein results in activation of an AMP-activated kinase signaling pathway which affects levels of fatty acid oxidation and insulin sensitivity. A pseudogene of this gene is located on chromosome 14. Multiple alternatively spliced transcript variants have been found for this gene. | adiponectin receptor 1 | ADIPOR1 | ENSG00000159346 | NA |
| ENSG00000248223 | NA | NA | CTD-2139B15.2 | ENSG00000248223 | NA |
| ENSG00000267904 | NA | NA | CTC-429P9.5 | ENSG00000267904 | NA |
| 51528 | NA | JNK1/MAPK8-associated membrane protein | JKAMP | ENSG00000050130 | NA |
| 121274 | NA | zinc finger protein 641 | ZNF641 | ENSG00000167528 | NA |
| 1486 | Chitobiase is a lysosomal glycosidase involved in degradation of asparagine-linked oligosaccharides on glycoproteins (Aronson and Kuranda, 1989 [PubMed 2531691]). | chitobiase | CTBS | ENSG00000117151 | NA |
| 26224 | This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbls class and, in addition to an F-box, contains several tandem leucine-rich repeats and is localized in the nucleus. | F-box and leucine rich repeat protein 3 | FBXL3 | ENSG00000005812 | NA |
| ENSG00000232909 | NA | NA | RP3-510O8.4 | ENSG00000232909 | NA |
| 253725 | NA | family with sequence similarity 21 member C | FAM21C | ENSG00000172661 | NA |
| 81545 | This gene encodes a large protein that contains an F-box domain and may participate in protein ubiquitination. The encoded protein is a transcriptional co-activator of Krueppel-like factor 7 (Klf7). A heterozygous mutation in this gene was found in individuals with autosomal dominant distal hereditary motor neuronopathy type IID. There is a pseudogene for this gene on chromosome 4. Alternative splicing results in multiple transcript variants. | F-box protein 38 | FBXO38 | ENSG00000145868 | NA |
| 10367 | This gene encodes an essential regulator of mitochondrial Ca2+ uptake under basal conditions. The encoded protein interacts with the mitochondrial calcium uniporter, a mitochondrial inner membrane Ca2+ channel, and is essential in preventing mitochondrial Ca2+ overload, which can cause excessive production of reactive oxygen species and cell stress. Alternatively spliced transcript variants encoding different isoforms have been described. | mitochondrial calcium uptake 1 | MICU1 | ENSG00000107745 | NA |
| 10970 | NA | cytoskeleton-associated protein 4 | CKAP4 | ENSG00000136026 | NA |
| 5681 | NA | protein serine kinase H1 | PSKH1 | ENSG00000159792 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 12, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[13,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| notfound | query | X_id | symbol | name | summary |
|---|---|---|---|---|---|
| TRUE | ENSG00000271738 | NA | NA | NA | NA |
| NA | ENSG00000272512 | ENSG00000272512 | RP11-54O7.17 | NA | NA |
| NA | ENSG00000172201 | 3400 | ID4 | inhibitor of DNA binding 4, HLH protein | This gene encodes a member of the inhibitor of DNA binding (ID) protein family. These proteins are basic helix-loop-helix transcription factors which can act as tumor suppressors but lack DNA binding activity. Consequently, the activity of the encoded protein depends on the protein binding partner. |
| NA | ENSG00000173175 | 111 | ADCY5 | adenylate cyclase 5 | This gene encodes a member of the membrane-bound adenylyl cyclase enzymes. Adenylyl cyclases mediate G protein-coupled receptor signaling through the synthesis of the second messenger cAMP. Activity of the encoded protein is stimulated by the Gs alpha subunit of G protein-coupled receptors and is inhibited by protein kinase A, calcium and Gi alpha subunits. Single nucleotide polymorphisms in this gene may be associated with low birth weight and type 2 diabetes. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. |
| NA | ENSG00000229732 | ENSG00000229732 | AC019349.5 | NA | NA |
| NA | ENSG00000122378 | 84293 | FAM213A | family with sequence similarity 213 member A | NA |
| NA | ENSG00000133401 | 23037 | PDZD2 | PDZ domain containing 2 | The protein encoded by this gene contains six PDZ domains and shares sequence similarity with pro-interleukin-16 (pro-IL-16). Like pro-IL-16, the encoded protein localizes to the endoplasmic reticulum and is thought to be cleaved by a caspase to produce a secreted peptide containing two PDZ domains. In addition, this gene is upregulated in primary prostate tumors and may be involved in the early stages of prostate tumorigenesis. |
| NA | ENSG00000198300 | 5178 | PEG3 | paternally expressed 3 | In human, ZIM2 and PEG3 are treated as two distinct genes though they share multiple 5’ exons and a common promoter and both genes are paternally expressed (PMID:15203203). Alternative splicing events connect their shared 5’ exons either with the remaining 4 exons unique to ZIM2, or with the remaining 2 exons unique to PEG3. In contrast, in other mammals ZIM2 does not undergo imprinting and, in mouse, cow, and likely other mammals as well, the ZIM2 and PEG3 genes do not share exons. Human PEG3 protein belongs to the Kruppel C2H2-type zinc finger protein family. PEG3 may play a role in cell proliferation and p53-mediated apoptosis. PEG3 has also shown tumor suppressor activity and tumorigenesis in glioma and ovarian cells. Alternative splicing of this PEG3 gene results in multiple transcript variants encoding distinct isoforms. |
| NA | ENSG00000157404 | 3815 | KIT | KIT proto-oncogene receptor tyrosine kinase | This gene encodes the human homolog of the proto-oncogene c-kit. C-kit was first identified as the cellular homolog of the feline sarcoma viral oncogene v-kit. This protein is a type 3 transmembrane receptor for MGF (mast cell growth factor, also known as stem cell factor). Mutations in this gene are associated with gastrointestinal stromal tumors, mast cell disease, acute myelogenous leukemia, and piebaldism. Multiple transcript variants encoding different isoforms have been found for this gene. |
| NA | ENSG00000105088 | 93145 | OLFM2 | olfactomedin 2 | NA |
| NA | ENSG00000160145 | 8997 | KALRN | kalirin, RhoGEF kinase | Huntington’s disease (HD), a neurodegenerative disorder characterized by loss of striatal neurons, is caused by an expansion of a polyglutamine tract in the HD protein huntingtin. This gene encodes a protein that interacts with the huntingtin-associated protein 1, which is a huntingtin binding protein that may function in vesicle trafficking. |
| NA | ENSG00000272678 | ENSG00000272678 | RP11-797D24.4 | NA | NA |
| NA | ENSG00000163485 | 134 | ADORA1 | adenosine A1 receptor | The protein encoded by this gene is an adenosine receptor that belongs to the G-protein coupled receptor 1 family. There are 3 types of adenosine receptors, each with a specific pattern of ligand binding and tissue distribution, and together they regulate a diverse set of physiologic functions. The type A1 receptors inhibit adenylyl cyclase, and play a role in the fertilization process. Animal studies also suggest a role for A1 receptors in kidney function and ethanol intoxication. Transcript variants with alternative splicing in the 5’ UTR have been found for this gene. |
| NA | ENSG00000173898 | 6712 | SPTBN2 | spectrin beta, non-erythrocytic 2 | Spectrins are principal components of a cell’s membrane-cytoskeleton and are composed of two alpha and two beta spectrin subunits. The protein encoded by this gene (SPTBN2) is called spectrin beta non-erythrocytic 2 or beta-III spectrin. It is related to, but distinct from, the beta-II spectrin gene which is also known as spectrin beta non-erythrocytic 1 (SPTBN1). SPTBN2 regulates the glutamate signaling pathway by stabilizing the glutamate transporter EAAT4 at the surface of the plasma membrane. Mutations in this gene cause a form of spinocerebellar ataxia, SCA5, that is characterized by neurodegeneration, progressive locomotor incoordination, dysarthria, and uncoordinated eye movements. |
| NA | ENSG00000171766 | 2628 | GATM | glycine amidinotransferase | This gene encodes a mitochondrial enzyme that belongs to the amidinotransferase family. This enzyme is involved in creatine biosynthesis, whereby it catalyzes the transfer of a guanido group from L-arginine to glycine, resulting in guanidinoacetic acid, the immediate precursor of creatine. Mutations in this gene cause arginine:glycine amidinotransferase deficiency, an inborn error of creatine synthesis characterized by mental retardation, language impairment, and behavioral disorders. |
| NA | ENSG00000025039 | 58528 | RRAGD | Ras related GTP binding D | RRAGD is a monomeric guanine nucleotide-binding protein, or G protein. By binding GTP or GDP, small G proteins act as molecular switches in numerous cell processes and signaling pathways. |
| NA | ENSG00000182902 | 83733 | SLC25A18 | solute carrier family 25 member 18 | NA |
| NA | ENSG00000103034 | 65009 | NDRG4 | NDRG family member 4 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that is required for cell cycle progression and survival in primary astrocytes and may be involved in the regulation of mitogenic signalling in vascular smooth muscle cells. Alternative splicing results in multiple transcripts encoding different isoforms. |
| NA | ENSG00000149809 | 7108 | TM7SF2 | transmembrane 7 superfamily member 2 | NA |
| NA | ENSG00000162373 | 79656 | BEND5 | BEN domain containing 5 | NA |
| NA | ENSG00000121690 | 91614 | DEPDC7 | DEP domain containing 7 | NA |
| NA | ENSG00000163209 | 6707 | SPRR3 | small proline rich protein 3 | NA |
| NA | ENSG00000260244 | ENSG00000260244 | RP11-588K22.2 | NA | NA |
| NA | ENSG00000136237 | 9771 | RAPGEF5 | Rap guanine nucleotide exchange factor 5 | Members of the RAS (see HRAS; MIM 190020) subfamily of GTPases function in signal transduction as GTP/GDP-regulated switches that cycle between inactive GDP- and active GTP-bound states. Guanine nucleotide exchange factors (GEFs), such as RAPGEF5, serve as RAS activators by promoting acquisition of GTP to maintain the active GTP-bound state and are the key link between cell surface receptors and RAS activation (Rebhun et al., 2000 [PubMed 10934204]). |
| NA | ENSG00000204677 | ENSG00000204677 | FAM153C | family with sequence similarity 153 member C | NA |
| NA | ENSG00000236609 | 54753 | ZNF853 | zinc finger protein 853 | NA |
| NA | ENSG00000169509 | 54544 | CRCT1 | cysteine rich C-terminal 1 | NA |
| NA | ENSG00000163864 | 349565 | NMNAT3 | nicotinamide nucleotide adenylyltransferase 3 | This gene encodes a member of the nicotinamide/nicotinic acid mononucleotide adenylyltransferase family. These enzymes use ATP to catalyze the synthesis of nicotinamide adenine dinucleotide or nicotinic acid adenine dinucleotide from nicotinamide mononucleotide or nicotinic acid mononucleotide, respectively. The encoded protein is localized to mitochondria and may also play a neuroprotective role as a molecular chaperone. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| TRUE | ENSG00000268358 | NA | NA | NA | NA |
| NA | ENSG00000125378 | 652 | BMP4 | bone morphogenetic protein 4 | This gene encodes a member of the bone morphogenetic protein (BMP) family of proteins, which is part of the transforming growth factor-beta (TGF-beta) superfamily. Members of the BMP family play an important role in bone and cartilage development. The encoded preproprotein is proteolytically processed to generate each subunit of the disulfide-linked homodimer. Mutations in this gene are associated with orofacial cleft and microphthalmia in human patients. The encoded protein may also be involved in the pathology of multiple cardiovascular diseases and human cancers. Alternative splicing results in multiple transcript variants. |
| NA | ENSG00000125780 | 7053 | TGM3 | transglutaminase 3 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene consists of two polypeptide chains activated from a single precursor protein by proteolysis. The encoded protein is involved in the later stages of cell envelope formation in the epidermis and hair follicle. |
| NA | ENSG00000182985 | 23705 | CADM1 | cell adhesion molecule 1 | NA |
| NA | ENSG00000164116 | 2982 | GUCY1A3 | guanylate cyclase 1, soluble, alpha 3 | Soluble guanylate cyclases are heterodimeric proteins that catalyze the conversion of GTP to 3’,5’-cyclic GMP and pyrophosphate. The protein encoded by this gene is an alpha subunit of this complex and it interacts with a beta subunit to form the guanylate cyclase enzyme, which is activated by nitric oxide. Several transcript variants encoding a few different isoforms have been found for this gene. |
| NA | ENSG00000259933 | ENSG00000259933 | RP11-304L19.1 | NA | NA |
| NA | ENSG00000182230 | 202134 | FAM153B | family with sequence similarity 153 member B | NA |
| NA | ENSG00000182230 | 100507387 | LOC100507387 | uncharacterized LOC100507387 | NA |
| NA | ENSG00000186998 | 129080 | EMID1 | EMI domain containing 1 | NA |
| NA | ENSG00000106772 | 158471 | PRUNE2 | prune homolog 2 | The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. |
| NA | ENSG00000124225 | 56937 | PMEPA1 | prostate transmembrane protein, androgen induced 1 | This gene encodes a transmembrane protein that contains a Smad interacting motif (SIM). Expression of this gene is induced by androgens and transforming growth factor beta, and the encoded protein suppresses the androgen receptor and transforming growth factor beta signaling pathways through interactions with Smad proteins. Overexpression of this gene may play a role in multiple types of cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| NA | ENSG00000134121 | 10752 | CHL1 | cell adhesion molecule L1 like | The protein encoded by this gene is a member of the L1 gene family of neural cell adhesion molecules. It is a neural recognition molecule that may be involved in signal transduction pathways. The deletion of one copy of this gene may be responsible for mental defects in patients with 3p- syndrome. This protein may also play a role in the growth of certain cancers. Alternate splicing results in both coding and non-coding variants. |
| NA | ENSG00000135423 | 27165 | GLS2 | glutaminase 2 | The protein encoded by this gene is a mitochondrial phosphate-activated glutaminase that catalyzes the hydrolysis of glutamine to stoichiometric amounts of glutamate and ammonia. Originally thought to be liver-specific, this protein has been found in other tissues as well. Alternative splicing results in multiple transcript variants that encode different isoforms. |
| NA | ENSG00000154330 | 5239 | PGM5 | phosphoglucomutase 5 | Phosphoglucomutases (EC 5.4.2.2), such as PGM5, are phosphotransferases involved in interconversion of glucose-1-phosphate and glucose-6-phosphate. PGM activity is essential in formation of carbohydrates from glucose-6-phosphate and in formation of glucose-6-phosphate from galactose and glycogen (Edwards et al., 1995 [PubMed 8586438]). |
| NA | ENSG00000140451 | 80119 | PIF1 | PIF1 5’-to-3’ DNA helicase | This gene encodes a DNA-dependent adenosine triphosphate (ATP)-metabolizing enzyme that functions as a 5’ to 3’ DNA helicase. The encoded protein can resolve G-quadruplex structures and RNA-DNA hybrids at the ends of chromosomes. It also prevents telomere elongation by inhibiting the actions of telomerase. Alternative splicing and the use of alternative start codons results in multiple isoforms that are differentially localized to either the mitochondria or the nucleus. |
| NA | ENSG00000142178 | 150094 | SIK1 | salt inducible kinase 1 | NA |
| NA | ENSG00000130787 | 9026 | HIP1R | huntingtin interacting protein 1 related | NA |
| NA | ENSG00000130702 | 3911 | LAMA5 | laminin subunit alpha 5 | This gene encodes one of the vertebrate laminin alpha chains. Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. The protein encoded by this gene is the alpha-5 subunit of laminin-10 (laminin-511), laminin-11 (laminin-521) and laminin-15 (laminin-523). |
| NA | ENSG00000008710 | 5310 | PKD1 | polycystin 1, transient receptor potential channel interacting | This gene encodes a member of the polycystin protein family. The encoded glycoprotein contains a large N-terminal extracellular region, multiple transmembrane domains and a cytoplasmic C-tail. It is an integral membrane protein that functions as a regulator of calcium permeable cation channels and intracellular calcium homoeostasis. It is also involved in cell-cell/matrix interactions and may modulate G-protein-coupled signal-transduction pathways. It plays a role in renal tubular development, and mutations in this gene cause autosomal dominant polycystic kidney disease type 1 (ADPKD1). ADPKD1 is characterized by the growth of fluid-filled cysts that replace normal renal tissue and result in end-stage renal failure. Splice variants encoding different isoforms have been noted for this gene. Also, six pseudogenes, closely linked in a known duplicated region on chromosome 16p, have been described. |
| NA | ENSG00000205336 | 9289 | ADGRG1 | adhesion G protein-coupled receptor G1 | This gene encodes a member of the G protein-coupled receptor family and regulates brain cortical patterning. The encoded protein binds specifically to transglutaminase 2, a component of tissue and tumor stroma implicated as an inhibitor of tumor progression. Mutations in this gene are associated with a brain malformation known as bilateral frontoparietal polymicrogyria. Alternative splicing results in multiple transcript variants. |
| NA | ENSG00000169758 | 123591 | TMEM266 | transmembrane protein 266 | NA |
| TRUE | ENSG00000184674 | NA | NA | NA | NA |
| NA | ENSG00000169116 | 25849 | PARM1 | prostate androgen-regulated mucin-like protein 1 | NA |
| NA | ENSG00000188732 | 340277 | FAM221A | family with sequence similarity 221 member A | NA |
| NA | ENSG00000101096 | 4773 | NFATC2 | nuclear factor of activated T-cells 2 | This gene is a member of the nuclear factor of activated T cells (NFAT) family. The product of this gene is a DNA-binding protein with a REL-homology region (RHR) and an NFAT-homology region (NHR). This protein is present in the cytosol and only translocates to the nucleus upon T cell receptor (TCR) stimulation, where it becomes a member of the nuclear factors of activated T cells transcription complex. This complex plays a central role in inducing gene transcription during the immune response. Alternate transcriptional splice variants encoding different isoforms have been characterized. |
| NA | ENSG00000231584 | ENSG00000231584 | FAHD2CP | fumarylacetoacetate hydrolase domain containing 2C, pseudogene | NA |
| NA | ENSG00000171772 | 93426 | SYCE1 | synaptonemal complex central element protein 1 | NA |
| NA | ENSG00000171401 | 3860 | KRT13 | keratin 13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. |
| NA | ENSG00000039068 | 999 | CDH1 | cadherin 1 | This gene encodes a classical cadherin of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature glycoprotein. This calcium-dependent cell-cell adhesion protein is comprised of five extracellular cadherin repeats, a transmembrane region and a highly conserved cytoplasmic tail. Mutations in this gene are correlated with gastric, breast, colorectal, thyroid and ovarian cancer. Loss of function of this gene is thought to contribute to cancer progression by increasing proliferation, invasion, and/or metastasis. The ectodomain of this protein mediates bacterial adhesion to mammalian cells and the cytoplasmic domain is required for internalization. This gene is present in a gene cluster with other members of the cadherin family on chromosome 16. |
| NA | ENSG00000136848 | 153090 | DAB2IP | DAB2 interacting protein | DAB2IP is a Ras (MIM 190020) GTPase-activating protein (GAP) that acts as a tumor suppressor. The DAB2IP gene is inactivated by methylation in prostate and breast cancers (Yano et al., 2005 [PubMed 15386433]). |
| NA | ENSG00000183779 | 80139 | ZNF703 | zinc finger protein 703 | NA |
| NA | ENSG00000152583 | 8404 | SPARCL1 | SPARC like 1 | NA |
| NA | ENSG00000120278 | 57480 | PLEKHG1 | pleckstrin homology and RhoGEF domain containing G1 | NA |
| NA | ENSG00000099282 | 23555 | TSPAN15 | tetraspanin 15 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. The use of alternate polyadenylation sites has been found for this gene. |
| NA | ENSG00000198719 | 28514 | DLL1 | delta like canonical Notch ligand 1 | DLL1 is a human homolog of the Notch Delta ligand and is a member of the delta/serrate/jagged family. It plays a role in mediating cell fate decisions during hematopoiesis. It may play a role in cell-to-cell communication. |
| NA | ENSG00000101447 | 81610 | FAM83D | family with sequence similarity 83 member D | NA |
| TRUE | ENSG00000257026 | NA | NA | NA | NA |
| NA | ENSG00000136002 | 50649 | ARHGEF4 | Rho guanine nucleotide exchange factor 4 | Rho GTPases play a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors. The protein encoded by this gene may form complex with G proteins and stimulate Rho-dependent signals. Multiple alternatively spliced transcript variants encoding different isoforms have been found, but the full-length nature of some variants has not been determined. |
| NA | ENSG00000135744 | 183 | AGT | angiotensinogen | The protein encoded by this gene, pre-angiotensinogen or angiotensinogen precursor, is expressed in the liver and is cleaved by the enzyme renin in response to lowered blood pressure. The resulting product, angiotensin I, is then cleaved by angiotensin converting enzyme (ACE) to generate the physiologically active enzyme angiotensin II. The protein is involved in maintaining blood pressure and in the pathogenesis of essential hypertension and preeclampsia. Mutations in this gene are associated with susceptibility to essential hypertension, and can cause renal tubular dysgenesis, a severe disorder of renal tubular development. Defects in this gene have also been associated with non-familial structural atrial fibrillation, and inflammatory bowel disease. |
| NA | ENSG00000154721 | 58494 | JAM2 | junctional adhesion molecule 2 | This gene belongs to the immunoglobulin superfamily, and the junctional adhesion molecule (JAM) family. The protein encoded by this gene is a type I membrane protein that is localized at the tight junctions of both epithelial and endothelial cells. It acts as an adhesive ligand for interacting with a variety of immune cell types, and may play a role in lymphocyte homing to secondary lymphoid organs. Alternatively spliced transcript variants have been found for this gene. |
| NA | ENSG00000167191 | 51704 | GPRC5B | G protein-coupled receptor class C group 5 member B | This gene encodes a member of the type 3 G protein-coupled receptor family. Members of this superfamily are characterized by a signature 7-transmembrane domain motif. The encoded protein may modulate insulin secretion and increased protein expression is associated with type 2 diabetes. Alternative splicing results in multiple transcript variants. |
| NA | ENSG00000206535 | 348801 | LNP1 | leukemia NUP98 fusion partner 1 | NA |
| NA | ENSG00000137269 | 55227 | LRRC1 | leucine rich repeat containing 1 | NA |
| NA | ENSG00000108852 | 4355 | MPP2 | membrane palmitoylated protein 2 | Palmitoylated membrane protein 2 is a member of a family of membrane-associated proteins termed MAGUKs (membrane-associated guanylate kinase homologs). MAGUKs interact with the cytoskeleton and regulate cell proliferation, signaling pathways, and intracellular junctions. Palmitoylated membrane protein 2 contains a conserved sequence, called the SH3 (src homology 3) motif, found in several other proteins that associate with the cytoskeleton and are suspected to play important roles in signal transduction. |
| NA | ENSG00000271218 | ENSG00000271218 | RP3-523E19.2 | NA | NA |
| NA | ENSG00000229953 | ENSG00000229953 | RP11-284F21.7 | NA | NA |
| NA | ENSG00000268751 | 643719 | SCGB1B2P | secretoglobin family 1B member 2, pseudogene | NA |
| NA | ENSG00000179057 | 283284 | IGSF22 | immunoglobulin superfamily member 22 | NA |
| NA | ENSG00000186260 | 57496 | MKL2 | MKL1/myocardin like 2 | NA |
| NA | ENSG00000081803 | 93664 | CADPS2 | Ca2+ dependent secretion activator 2 | This gene encodes a member of the calcium-dependent activator of secretion (CAPS) protein family, which are calcium binding proteins that regulate the exocytosis of synaptic and dense-core vesicles in neurons and neuroendocrine cells. Mutations in this gene may contribute to autism susceptibility. Multiple transcript variants encoding different isoforms have been found for this gene. |
| NA | ENSG00000272468 | ENSG00000272468 | RP1-86C11.7 | NA | NA |
| NA | ENSG00000003147 | 3382 | ICA1 | islet cell autoantigen 1 | This gene encodes a protein with an arfaptin homology domain that is found both in the cytosol and as membrane-bound form on the Golgi complex and immature secretory granules. This protein is believed to be an autoantigen in insulin-dependent diabetes mellitus and primary Sjogren’s syndrome. Several transcript variants encoding two different isoforms have been found for this gene. |
| NA | ENSG00000111879 | 79632 | FAM184A | family with sequence similarity 184 member A | NA |
| NA | ENSG00000156968 | 255027 | MPV17L | MPV17 mitochondrial inner membrane protein like | NA |
| NA | ENSG00000161835 | 160622 | GRASP | GRP1 (general receptor for phosphoinositides 1)-associated scaffold protein | This gene encodes a protein that functions as a molecular scaffold, linking receptors, including group 1 metabotropic glutamate receptors, to neuronal proteins. The encoded protein contains conserved domains, including a leucine zipper sequence, PDZ domain and a C-terminal PDZ-binding motif. Alternately spliced transcript variants have been observed for this gene. |
| NA | ENSG00000156113 | 3778 | KCNMA1 | potassium calcium-activated channel subfamily M alpha 1 | MaxiK channels are large conductance, voltage and calcium-sensitive potassium channels which are fundamental to the control of smooth muscle tone and neuronal excitability. MaxiK channels can be formed by 2 subunits: the pore-forming alpha subunit, which is the product of this gene, and the modulatory beta subunit. Intracellular calcium regulates the physical association between the alpha and beta subunits. Alternatively spliced transcript variants encoding different isoforms have been identified. |
| NA | ENSG00000162438 | 11330 | CTRC | chymotrypsin C | This gene encodes a member of the peptidase S1 family. The encoded protein is a serum calcium-decreasing factor that has chymotrypsin-like protease activity. Alternatively spliced transcript variants have been observed, but their full-length nature has not been determined. |
| NA | ENSG00000063180 | 770 | CA11 | carbonic anhydrase 11 | Carbonic anhydrases (CAs) are a large family of zinc metalloenzymes that catalyze the reversible hydration of carbon dioxide. They participate in a variety of biological processes, including respiration, calcification, acid-base balance, bone resorption, and the formation of aqueous humor, cerebrospinal fluid, saliva, and gastric acid. They show extensive diversity in tissue distribution and in their subcellular localization. CA XI is likely a secreted protein, however, radical changes at active site residues completely conserved in CA isozymes with catalytic activity, make it unlikely that it has carbonic anhydrase activity. It shares properties in common with two other acatalytic CA isoforms, CA VIII and CA X. CA XI is most abundantly expressed in brain, and may play a general role in the central nervous system. |
| NA | ENSG00000172264 | 140733 | MACROD2 | MACRO domain containing 2 | NA |
| NA | ENSG00000186352 | 353322 | ANKRD37 | ankyrin repeat domain 37 | NA |
| NA | ENSG00000136160 | 1910 | EDNRB | endothelin receptor type B | The protein encoded by this gene is a G protein-coupled receptor which activates a phosphatidylinositol-calcium second messenger system. Its ligand, endothelin, consists of a family of three potent vasoactive peptides: ET1, ET2, and ET3. Studies suggest that the multigenic disorder, Hirschsprung disease type 2, is due to mutations in the endothelin receptor type B gene. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. |
| NA | ENSG00000129595 | 64097 | EPB41L4A | erythrocyte membrane protein band 4.1 like 4A | Members of the band 4.1 protein superfamily, including EPB41L4A, are thought to regulate the interaction between the cytoskeleton and plasma membrane (Ishiguro et al., 2000 [PubMed 10874211]). |
| NA | ENSG00000215845 | 100131187 | TSTD1 | thiosulfate sulfurtransferase like domain containing 1 | NA |
| NA | ENSG00000106123 | 2051 | EPHB6 | EPH receptor B6 | This gene encodes a member of a family of transmembrane proteins that function as receptors for ephrin-B family proteins. Unlike other members of this family, the encoded protein does not contain a functional kinase domain. Activity of this protein can influence cell adhesion and migration. Expression of this gene is downregulated during tumor progression, suggesting that the protein may suppress tumor invasion and metastasis. Alternative splicing results in multiple transcript variants. |
| NA | ENSG00000188763 | 8326 | FZD9 | frizzled class receptor 9 | Members of the ‘frizzled’ gene family encode 7-transmembrane domain proteins that are receptors for Wnt signaling proteins. The FZD9 gene is located within the Williams syndrome common deletion region of chromosome 7, and heterozygous deletion of the FZD9 gene may contribute to the Williams syndrome phenotype. FZD9 is expressed predominantly in brain, testis, eye, skeletal muscle, and kidney. |
| NA | ENSG00000143536 | 49860 | CRNN | cornulin | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. |
| NA | ENSG00000072201 | 84708 | LNX1 | ligand of numb-protein X 1 | This gene encodes a membrane-bound protein that is involved in signal transduction and protein interactions. The encoded product is an E3 ubiquitin-protein ligase, which mediates ubiquitination and subsequent proteasomal degradation of proteins containing phosphotyrosine binding (PTB) domains. This protein may play an important role in tumorigenesis. Alternatively spliced transcript variants encoding distinct isoforms have been described. A pseudogene, which is located on chromosome 17, has been identified for this gene. |
| NA | ENSG00000171159 | 79095 | C9orf16 | chromosome 9 open reading frame 16 | NA |
| NA | ENSG00000183049 | 57118 | CAMK1D | calcium/calmodulin dependent protein kinase ID | This gene is a member of the calcium/calmodulin-dependent protein kinase 1 family, a subfamily of the serine/threonine kinases. The encoded protein is a component of the calcium-regulated calmodulin-dependent protein kinase cascade. It has been associated with multiple processes including regulation of granulocyte function, activation of CREB-dependent gene transcription, aldosterone synthesis, differentiation and activation of neutrophil cells, and apoptosis of erythroleukemia cells. Alternatively spliced transcript variants encoding different isoforms of this gene have been described. |
| NA | ENSG00000076554 | 7163 | TPD52 | tumor protein D52 | NA |
| NA | ENSG00000188779 | 390598 | SKOR1 | SKI family transcriptional corepressor 1 | NA |
| NA | ENSG00000162772 | 467 | ATF3 | activating transcription factor 3 | This gene encodes a member of the mammalian activation transcription factor/cAMP responsive element-binding (CREB) protein family of transcription factors. This gene is induced by a variety of signals, including many of those encountered by cancer cells, and is involved in the complex process of cellular stress response. Multiple transcript variants encoding different isoforms have been found for this gene. It is possible that alternative splicing of this gene may be physiologically important in the regulation of target genes. |
| NA | ENSG00000261113 | ENSG00000261113 | RP11-141O15.1 | NA | NA |
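
The table above mixes three kinds of rows: annotated genes, IDs that mygene could not resolve (`notfound` is `TRUE`, such as ENSG00000268358), and IDs that match more than one record (ENSG00000182230 appears twice, as FAM153B and LOC100507387). A minimal sketch of how these cases might be tallied before the IDs are written out is given below. The data frame `toy_out` is a hand-built stand-in for `as.data.frame(out)`, and `summarise_hits` is a hypothetical helper that is not part of the original analysis or of the mygene package.

```r
# Toy stand-in for as.data.frame(out); rows copy a few entries from the table above.
toy_out <- data.frame(
  notfound = c(NA, TRUE, NA, NA),
  query    = c("ENSG00000125378", "ENSG00000268358",
               "ENSG00000182230", "ENSG00000182230"),
  symbol   = c("BMP4", NA, "FAM153B", "LOC100507387"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: tally distinct, unresolved, and multiply matched queries.
summarise_hits <- function(df) {
  # The notfound column is absent when every query resolves, so default to FALSE.
  flag <- if (is.null(df$notfound)) rep(FALSE, nrow(df)) else df$notfound
  missing <- !is.na(flag) & flag
  list(
    n_queries   = length(unique(df$query)),
    not_found   = df$query[missing],
    multi_match = unique(df$query[duplicated(df$query)])
  )
}

hits <- summarise_hits(toy_out)
```

Here `hits$not_found` lists the unresolved IDs and `hits$multi_match` lists the IDs that occupy more than one row, which is useful to know because the row count of `out` can exceed the 100 features requested per cluster.
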
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 13, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
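
Because `out$query` has one entry per returned record, an ID that matches two records is written twice by the call above. If a list of distinct IDs is wanted, one possible variant is sketched here. The helper name `write_cluster_ids` and its `out_dir` argument are hypothetical, and the demo writes to a temporary directory rather than to the `GTEX2013_sparse_load_voom` folder.

```r
# Hypothetical helper: write the unique query IDs of one cluster, one per line.
write_cluster_ids <- function(ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  writeLines(unique(as.character(ids)), path)
  invisible(path)
}

# Toy usage; in the analysis, ids would be out$query for the cluster at hand.
demo_ids  <- c("ENSG00000182230", "ENSG00000182230", "ENSG00000186998")
demo_path <- write_cluster_ids(demo_ids, 13, tempdir())
written   <- readLines(demo_path)
```

The file name follows the same `gene_names_clus_<cluster>.txt` pattern as the original call, so downstream steps that read these files should not need to change under this assumption.
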
out <- mygene::queryMany(gene_list[14,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | symbol | summary | query | name | notfound |
|---|---|---|---|---|---|
| 5967 | REG1A | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | ENSG00000115386 | regenerating family member 1 alpha | NA |
| NA | NA | NA | ENSG00000165862 | NA | TRUE |
| 22943 | DKK1 | This gene encodes a protein that is a member of the dickkopf family. It is a secreted protein with two cysteine rich regions and is involved in embryonic development through its inhibition of the WNT signaling pathway. Elevated levels of DKK1 in bone marrow plasma and peripheral blood is associated with the presence of osteolytic bone lesions in patients with multiple myeloma. | ENSG00000107984 | dickkopf WNT signaling pathway inhibitor 1 | NA |
| 123036 | TC2N | NA | ENSG00000165929 | tandem C2 domains, nuclear | NA |
| 132299 | OCIAD2 | NA | ENSG00000145247 | OCIA domain containing 2 | NA |
| 9874 | TLK1 | The protein encoded by this gene is a serine/threonine kinase that may be involved in the regulation of chromatin assembly. The encoded protein is only active when it is phosphorylated, and this phosphorylation is cell cycle-dependent, with the maximal activity of this protein coming during S phase. The catalytic activity of this protein is diminished by DNA damage and by blockage of DNA replication. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000198586 | tousled like kinase 1 | NA |
| 5721 | PSME2 | The 26S proteasome is a multicatalytic proteinase complex with a highly ordered structure composed of 2 complexes, a 20S core and a 19S regulator. The 20S core is composed of 4 rings of 28 non-identical subunits; 2 rings are composed of 7 alpha subunits and 2 rings are composed of 7 beta subunits. The 19S regulator is composed of a base, which contains 6 ATPase subunits and 2 non-ATPase subunits, and a lid, which contains up to 10 non-ATPase subunits. Proteasomes are distributed throughout eukaryotic cells at a high concentration and cleave peptides in an ATP/ubiquitin-dependent process in a non-lysosomal pathway. An essential function of a modified proteasome, the immunoproteasome, is the processing of class I MHC peptides. The immunoproteasome contains an alternate regulator, referred to as the 11S regulator or PA28, that replaces the 19S regulator. Three subunits (alpha, beta and gamma) of the 11S regulator have been identified. This gene encodes the beta subunit of the 11S regulator, one of the two 11S subunits that is induced by gamma-interferon. Three beta and three alpha subunits combine to form a heterohexameric ring. Six pseudogenes have been identified on chromosomes 4, 5, 8, 10 and 13. | ENSG00000100911 | proteasome activator subunit 2 | NA |
| 5644 | PRSS1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | ENSG00000204983 | protease, serine 1 | NA |
| 939 | CD27 | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor is required for generation and long-term maintenance of T cell immunity. It binds to ligand CD70, and plays a key role in regulating B-cell activation and immunoglobulin synthesis. This receptor transduces signals that lead to the activation of NF-kappaB and MAPK8/JNK. Adaptor proteins TRAF2 and TRAF5 have been shown to mediate the signaling process of this receptor. CD27-binding protein (SIVA), a proapoptotic protein, can bind to this receptor and is thought to play an important role in the apoptosis induced by this receptor. | ENSG00000139193 | CD27 molecule | NA |
| 122618 | PLD4 | NA | ENSG00000166428 | phospholipase D family member 4 | NA |
| 23403 | FBXO46 | Members of the F-box protein family, such as FBXO46, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | ENSG00000177051 | F-box protein 46 | NA |
| ENSG00000233849 | AC022201.5 | NA | ENSG00000233849 | NA | NA |
| 6119 | RPA3 | NA | ENSG00000106399 | replication protein A3 | NA |
| 64393 | ZMAT3 | This gene encodes a protein containing three zinc finger domains and a nuclear localization signal. The mRNA and the protein of this gene are upregulated by wildtype p53 and overexpression of this gene inhibits tumor cell growth, suggesting that this gene may have a role in the p53-dependent growth regulatory pathway. Alternative splicing of this gene results in two transcript variants encoding two isoforms differing in only one amino acid. | ENSG00000172667 | zinc finger matrin-type 3 | NA |
| 163786 | SASS6 | SAS6 is necessary for centrosome duplication and functions during procentriole formation; SAS6 functions to ensure that each centriole seeds the formation of a single procentriole per cell cycle (Strnad et al., 2007 [PubMed 17681132]). | ENSG00000156876 | SAS-6 centriolar assembly protein | NA |
| 117584 | RFFL | NA | ENSG00000092871 | ring finger and FYVE-like domain containing E3 ubiquitin protein ligase | NA |
| 34 | ACADM | This gene encodes the medium-chain specific (C4 to C12 straight chain) acyl-Coenzyme A dehydrogenase. The homotetramer enzyme catalyzes the initial step of the mitochondrial fatty acid beta-oxidation pathway. Defects in this gene cause medium-chain acyl-CoA dehydrogenase deficiency, a disease characterized by hepatic dysfunction, fasting hypoglycemia, and encephalopathy, which can result in infantile death. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000117054 | acyl-CoA dehydrogenase, C-4 to C-12 straight chain | NA |
| ENSG00000237950 | RP11-7O11.3 | NA | ENSG00000237950 | NA | NA |
| ENSG00000183444 | OR7E38P | NA | ENSG00000183444 | olfactory receptor family 7 subfamily E member 38 pseudogene | NA |
| 11179 | ZNF277 | NA | ENSG00000198839 | zinc finger protein 277 | NA |
| 6772 | STAT1 | The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein can be activated by various ligands including interferon-alpha, interferon-gamma, EGF, PDGF and IL6. This protein mediates the expression of a variety of genes, which is thought to be important for cell viability in response to different cell stimuli and pathogens. Two alternatively spliced transcript variants encoding distinct isoforms have been described. | ENSG00000115415 | signal transducer and activator of transcription 1 | NA |
| ENSG00000230177 | RP5-1112D6.4 | NA | ENSG00000230177 | NA | NA |
| 1611 | DAP | This gene encodes a basic, proline-rich, 15-kD protein. The protein acts as a positive mediator of programmed cell death that is induced by interferon-gamma. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | ENSG00000112977 | death-associated protein | NA |
| 55349 | CHDH | The protein encoded by this gene is a choline dehydrogenase that localizes to the mitochondrion. Variations in this gene can affect susceptibility to choline deficiency. A few transcript variants have been found for this gene, but the full-length nature of only one has been characterized to date. | ENSG00000016391 | choline dehydrogenase | NA |
| 3978 | LIG1 | This gene encodes a member of the ATP-dependent DNA ligase protein family. The encoded protein functions in DNA replication, recombination, and the base excision repair process. Mutations in this gene that lead to DNA ligase I deficiency result in immunodeficiency and increased sensitivity to DNA-damaging agents. Disruption of this gene may also be associated with a variety of cancers. Alternative splicing results in multiple transcript variants. | ENSG00000105486 | DNA ligase 1 | NA |
| 11019 | LIAS | The protein encoded by this gene belongs to the biotin and lipoic acid synthetases family. It localizes in mitochondrion and plays an important role in alpha-(+)-lipoic acid synthesis. It may also function in the sulfur insertion chemistry in lipoate biosynthesis. Alternative splicing occurs at this locus and two transcript variants encoding distinct isoforms have been identified. | ENSG00000121897 | lipoic acid synthetase | NA |
| 221294 | NT5DC1 | While the exact function of the protein encoded by this gene is not known, it belongs to the 5’(3’)-deoxyribonucleotidase family. | ENSG00000178425 | 5’-nucleotidase domain containing 1 | NA |
| ENSG00000203644 | RP11-332M2.1 | NA | ENSG00000203644 | NA | NA |
| 55166 | CENPQ | CENPQ is a subunit of a CENPH (MIM 605607)-CENPI (MIM 300065)-associated centromeric complex that targets CENPA (MIM 117139) to centromeres and is required for proper kinetochore function and mitotic progression (Okada et al., 2006 [PubMed 16622420]). | ENSG00000031691 | centromere protein Q | NA |
| 84333 | PCGF5 | NA | ENSG00000180628 | polycomb group ring finger 5 | NA |
| 2222 | FDFT1 | This gene encodes a membrane-associated enzyme located at a branch point in the mevalonate pathway. The encoded protein is the first specific enzyme in cholesterol biosynthesis, catalyzing the dimerization of two molecules of farnesyl diphosphate in a two-step reaction to form squalene. | ENSG00000079459 | farnesyl-diphosphate farnesyltransferase 1 | NA |
| 29916 | SNX11 | This gene encodes a member of the sorting nexin family. Members of this family contain a phox (PX) domain, which is a phosphoinositide binding domain, and are involved in intracellular trafficking. This protein does not contain a coiled coil region, like some family members. This gene encodes a protein of unknown function. This gene results in two transcript variants differing in the 5’ UTR, but encoding the same protein. | ENSG00000002919 | sorting nexin 11 | NA |
| ENSG00000236326 | RP3-486I3.5 | NA | ENSG00000236326 | NA | NA |
| 85441 | HELZ2 | The protein encoded by this gene is a nuclear transcriptional co-activator for peroxisome proliferator activated receptor alpha. The encoded protein contains a zinc finger and is a helicase that appears to be part of the peroxisome proliferator activated receptor alpha interacting complex. This gene is a member of the DNA2/NAM7 helicase gene family. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000130589 | helicase with zinc finger 2 | NA |
| 121441 | NEDD1 | NA | ENSG00000139350 | neural precursor cell expressed, developmentally down-regulated 1 | NA |
| 51478 | HSD17B7 | HSD17B7 encodes an enzyme that functions both as a 17-beta-hydroxysteroid dehydrogenase (EC 1.1.1.62) in the biosynthesis of sex steroids and as a 3-ketosteroid reductase (EC 1.1.1.270) in the biosynthesis of cholesterol (Marijanovic et al., 2003 [PubMed 12829805]). | ENSG00000132196 | hydroxysteroid 17-beta dehydrogenase 7 | NA |
| 5229 | PGGT1B | Protein geranylgeranyltransferase type I (GGTase-I) transfers a geranylgeranyl group to the cysteine residue of candidate proteins containing a C-terminal CAAX motif in which ‘A’ is an aliphatic amino acid and ‘X’ is leucine (summarized by Zhang et al., 1994 [PubMed 8106351]). The enzyme is composed of a 48-kD alpha subunit (FNTA; MIM 134635) and a 43-kD beta subunit, encoded by the PGGT1B gene. The FNTA gene encodes the alpha subunit for both GGTase-I and the related enzyme farnesyltransferase. | ENSG00000164219 | protein geranylgeranyltransferase type I subunit beta | NA |
| ENSG00000213621 | RPSAP54 | NA | ENSG00000213621 | ribosomal protein SA pseudogene 54 | NA |
| 2272 | FHIT | This gene, a member of the histidine triad gene family, encodes a diadenosine 5’,5’’’-P1,P3-triphosphate hydrolase involved in purine metabolism. The gene encompasses the common fragile site FRA3B on chromosome 3, where carcinogen-induced damage can lead to translocations and aberrant transcripts of this gene. In fact, aberrant transcripts from this gene have been found in about half of all esophageal, stomach, and colon carcinomas. Alternatively spliced transcript variants have been found for this gene. | ENSG00000189283 | fragile histidine triad | NA |
| 23306 | NEMP1 | NA | ENSG00000166881 | nuclear envelope integral membrane protein 1 | NA |
| 4528 | MTIF2 | During the initiation of protein biosynthesis, initiation factor-2 (IF-2) promotes the binding of the initiator tRNA to the small subunit of the ribosome in a GTP-dependent manner. Prokaryotic IF-2 is a single polypeptide, while eukaryotic cytoplasmic IF-2 (eIF-2) is a trimeric protein. Bovine liver mitochondria contain IF-2(mt), an 85-kD monomeric protein that is equivalent to prokaryotic IF-2. The predicted 727-amino acid human protein contains a 29-amino acid presequence. Human IF-2(mt) shares 32 to 38% amino acid sequence identity with yeast IF-2(mt) and several prokaryotic IF-2s, with the greatest degree of conservation in the G domains of the proteins. | ENSG00000085760 | mitochondrial translational initiation factor 2 | NA |
| ENSG00000229931 | RP1-151F17.1 | NA | ENSG00000229931 | NA | NA |
| 79603 | CERS4 | NA | ENSG00000090661 | ceramide synthase 4 | NA |
| 5096 | PCCB | The protein encoded by this gene is a subunit of the propionyl-CoA carboxylase (PCC) enzyme, which is involved in the catabolism of propionyl-CoA. PCC is a mitochondrial enzyme that probably acts as a dodecamer of six alpha subunits and six beta subunits. This gene encodes the beta subunit of PCC. Defects in this gene are a cause of propionic acidemia type II (PA-2). Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000114054 | propionyl-CoA carboxylase beta subunit | NA |
| 51527 | GSKIP | This gene encodes a protein that is involved as a negative regulator of GSK3-beta in the Wnt signaling pathway. The encoded protein may play a role in the retinoic acid signaling pathway by regulating the functional interactions between GSK3-beta, beta-catenin and cyclin D1, and it regulates the beta-catenin/N-cadherin pool. The encoded protein contains a GSK3-beta interacting domain (GID) in its C-terminus, which is similar to the GID of Axin. The protein also contains an evolutionarily conserved RII-binding domain, which facilitates binding with protein kinase-A and GSK3-beta, enabling its role as an A-kinase anchoring protein. Alternatively spliced transcript variants have been observed for this gene. | ENSG00000100744 | GSK3B interacting protein | NA |
| 79980 | DSN1 | This gene encodes a kinetochore protein that functions as part of the minichromosome instability-12 centromere complex. The encoded protein is required for proper kinetochore assembly and progression through the cell cycle. Alternative splicing results in multiple transcript variants. | ENSG00000149636 | DSN1 homolog, MIS12 kinetochore complex component | NA |
| 5636 | PRPSAP2 | This gene encodes a protein that associates with the enzyme phosphoribosylpyrophosphate synthetase (PRS). PRS catalyzes the formation of phosphoribosylpyrophosphate which is a substrate for synthesis of purine and pyrimidine nucleotides, histidine, tryptophan and NAD. PRS exists as a complex with two catalytic subunits and two associated subunits. This gene encodes a non-catalytic associated subunit of PRS. Alternate splicing results in multiple transcript variants. | ENSG00000141127 | phosphoribosyl pyrophosphate synthetase associated protein 2 | NA |
| 51251 | NT5C3A | This gene encodes a member of the 5’-nucleotidase family of enzymes that catalyze the dephosphorylation of nucleoside 5’-monophosphates. The encoded protein is the type 1 isozyme of pyrimidine 5’ nucleotidase and catalyzes the dephosphorylation of pyrimidine 5’ monophosphates. Mutations in this gene are a cause of hemolytic anemia due to uridine 5-prime monophosphate hydrolase deficiency. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene, and pseudogenes of this gene are located on the long arm of chromosomes 3 and 4. | ENSG00000122643 | 5’-nucleotidase, cytosolic IIIA | NA |
| NA | NA | NA | ENSG00000233137 | NA | TRUE |
| 51106 | TFB1M | The protein encoded by this gene is a dimethyltransferase that methylates the conserved stem loop of mitochondrial 12S rRNA. The encoded protein also is part of the basal mitochondrial transcription complex and is necessary for mitochondrial gene expression. The methylation and transcriptional activities of this protein are independent of one another. Variations in this gene may influence the severity of aminoglycoside-induced deafness (AID). | ENSG00000029639 | transcription factor B1, mitochondrial | NA |
| 26148 | C10orf12 | NA | ENSG00000155640 | chromosome 10 open reading frame 12 | NA |
| ENSG00000216895 | AC009403.2 | NA | ENSG00000216895 | NA | NA |
| 283643 | C14orf80 | NA | ENSG00000185347 | chromosome 14 open reading frame 80 | NA |
| 57001 | SDHAF3 | NA | ENSG00000196636 | succinate dehydrogenase complex assembly factor 3 | NA |
| ENSG00000261684 | RP11-265N6.1 | NA | ENSG00000261684 | NA | NA |
| 339745 | SPOPL | NA | ENSG00000144228 | speckle type BTB/POZ protein like | NA |
| 8717 | TRADD | The protein encoded by this gene is a death domain containing adaptor molecule that interacts with TNFRSF1A/TNFR1 and mediates programmed cell death signaling and NF-kappaB activation. This protein binds adaptor protein TRAF2, reduces the recruitment of inhibitor-of-apoptosis proteins (IAPs) by TRAF2, and thus suppresses TRAF2 mediated apoptosis. This protein can also interact with receptor TNFRSF6/FAS and adaptor protein FADD/MORT1, and is involved in the Fas-induced cell death pathway. | ENSG00000102871 | TNFRSF1A associated via death domain | NA |
| NA | NA | NA | ENSG00000129282 | NA | TRUE |
| 150962 | PUS10 | Pseudouridination, the isomerization of uridine to pseudouridine, is the most common posttranscriptional nucleotide modification found in RNA and is essential for biologic functions such as spliceosome biogenesis. Pseudouridylate synthases, such as PUS10, catalyze pseudouridination of structural RNAs, including transfer, ribosomal, and splicing RNAs. These enzymes also act as RNA chaperones, facilitating the correct folding and assembly of tRNAs (McCleverty et al., 2007 [PubMed 17900615]). | ENSG00000162927 | pseudouridylate synthase 10 | NA |
| 10362 | HMG20B | NA | ENSG00000064961 | high mobility group 20B | NA |
| 195828 | ZNF367 | NA | ENSG00000165244 | zinc finger protein 367 | NA |
| 100131187 | TSTD1 | NA | ENSG00000215845 | thiosulfate sulfurtransferase like domain containing 1 | NA |
| 168374 | ZNF92 | NA | ENSG00000146757 | zinc finger protein 92 | NA |
| 55157 | DARS2 | The protein encoded by this gene belongs to the class-II aminoacyl-tRNA synthetase family. It is a mitochondrial enzyme that specifically aminoacylates aspartyl-tRNA. Mutations in this gene are associated with leukoencephalopathy with brainstem and spinal cord involvement and lactate elevation (LBSL). | ENSG00000117593 | aspartyl-tRNA synthetase 2, mitochondrial | NA |
| ENSG00000223551 | TMSB4XP4 | NA | ENSG00000223551 | thymosin beta 4, X-linked pseudogene 4 | NA |
| 4507 | MTAP | This gene encodes an enzyme that plays a major role in polyamine metabolism and is important for the salvage of both adenine and methionine. The encoded enzyme is deficient in many cancers because this gene and the tumor suppressor p16 gene are co-deleted. Multiple alternatively spliced transcript variants have been described for this gene, but their full-length natures remain unknown. | ENSG00000099810 | methylthioadenosine phosphorylase | NA |
| ENSG00000212789 | ST13P5 | NA | ENSG00000212789 | suppression of tumorigenicity 13 (colon carcinoma) (Hsp70 interacting protein) pseudogene 5 | NA |
| 55732 | C1orf112 | NA | ENSG00000000460 | chromosome 1 open reading frame 112 | NA |
| ENSG00000182165 | TP53TG1 | NA | ENSG00000182165 | TP53 target 1 (non-protein coding) | NA |
| 3455 | IFNAR2 | The protein encoded by this gene is a type I membrane protein that forms one of the two chains of a receptor for interferons alpha and beta. Binding and activation of the receptor stimulates Janus protein kinases, which in turn phosphorylate several proteins, including STAT1 and STAT2. Multiple transcript variants encoding at least two different isoforms have been found for this gene. | ENSG00000159110 | interferon alpha and beta receptor subunit 2 | NA |
| 644591 | PPIAL4G | NA | ENSG00000236334 | peptidylprolyl isomerase A like 4G | NA |
| 196743 | PAOX | NA | ENSG00000148832 | polyamine oxidase (exo-N4-amino) | NA |
| ENSG00000218175 | AC016739.2 | NA | ENSG00000218175 | NA | NA |
| 89891 | WDR34 | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. Defects in this gene are a cause of short-rib thoracic dysplasia 11 with or without polydactyly. | ENSG00000119333 | WD repeat domain 34 | NA |
| 26275 | HIBCH | This gene encodes the enzyme responsible for hydrolysis of both HIBYL-CoA and beta-hydroxypropionyl-CoA. Mutations in this gene have been associated with 3-hydroxyisobutyryl-CoA hydrolase deficiency. Alternative splicing results in multiple transcript variants. | ENSG00000198130 | 3-hydroxyisobutyryl-CoA hydrolase | NA |
| 10519 | CIB1 | This gene encodes a member of the EF-hand domain-containing calcium-binding superfamily. The encoded protein interacts with many other proteins, including the platelet integrin alpha-IIb-beta-3, DNA-dependent protein kinase, presenilin-2, focal adhesion kinase, p21 activated kinase, and protein kinase D. The encoded protein may be involved in cell survival and proliferation, and is associated with several disease states including cancer and Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | ENSG00000185043 | calcium and integrin binding 1 | NA |
| 3665 | IRF7 | IRF7 encodes interferon regulatory factor 7, a member of the interferon regulatory transcription factor (IRF) family. IRF7 has been shown to play a role in the transcriptional activation of virus-inducible cellular genes, including interferon beta chain genes. Inducible expression of IRF7 is largely restricted to lymphoid tissue. Multiple IRF7 transcript variants have been identified, although the functional consequences of these have not yet been established. | ENSG00000185507 | interferon regulatory factor 7 | NA |
| 25771 | TBC1D22A | NA | ENSG00000054611 | TBC1 domain family member 22A | NA |
| 339559 | ZFP69 | NA | ENSG00000187815 | ZFP69 zinc finger protein | NA |
| 4361 | MRE11A | This gene encodes a nuclear protein involved in homologous recombination, telomere length maintenance, and DNA double-strand break repair. By itself, the protein has 3’ to 5’ exonuclease activity and endonuclease activity. The protein forms a complex with the RAD50 homolog; this complex is required for nonhomologous joining of DNA ends and possesses increased single-stranded DNA endonuclease and 3’ to 5’ exonuclease activities. In conjunction with a DNA ligase, this protein promotes the joining of noncomplementary ends in vitro using short homologies near the ends of the DNA fragments. This gene has a pseudogene on chromosome 3. Alternative splicing of this gene results in two transcript variants encoding different isoforms. | ENSG00000020922 | MRE11 homolog A, double strand break repair nuclease | NA |
| 55734 | ZFP64 | NA | ENSG00000020256 | ZFP64 zinc finger protein | NA |
| 63979 | FIGNL1 | NA | ENSG00000132436 | fidgetin like 1 | NA |
| 900 | CCNG1 | The eukaryotic cell cycle is governed by cyclin-dependent protein kinases (CDKs) whose activities are regulated by cyclins and CDK inhibitors. The protein encoded by this gene is a member of the cyclin family and contains the cyclin box. The encoded protein lacks the protein destabilizing (PEST) sequence that is present in other family members. Transcriptional activation of this gene can be induced by tumor protein p53. Two transcript variants encoding the same protein have been identified for this gene. | ENSG00000113328 | cyclin G1 | NA |
| 55300 | PI4K2B | Phosphatidylinositol 4-kinases (PI4Ks) phosphorylate phosphatidylinositol to generate phosphatidylinositol 4-phosphate (PIP), an immediate precursor of several important signaling and scaffolding molecules. PIP itself may also have direct functional and structural roles. PI4K2B is a primarily cytosolic PI4K that is recruited to membranes, where it stimulates phosphatidylinositol 4,5-bisphosphate synthesis (Wei et al., 2002 [PubMed 12324459]). | ENSG00000038210 | phosphatidylinositol 4-kinase type 2 beta | NA |
| 25901 | CCDC28A | This gene is located in a region close to the locus of the pseudogene of chemokine (C-C motif) receptor-like 1 on chromosome 6. The specific function of this gene has not yet been determined. | ENSG00000024862 | coiled-coil domain containing 28A | NA |
| 201725 | C4orf46 | This gene encodes a small, conserved protein of unknown function that is expressed in a variety of tissues. There are pseudogenes for this gene on chromosomes 6, 8, 16, and X. Alternative splicing results in multiple transcript variants. | ENSG00000205208 | chromosome 4 open reading frame 46 | NA |
| 55170 | PRMT6 | The protein encoded by this gene belongs to the arginine N-methyltransferase family, which catalyze the sequential transfer of methyl group from S-adenosyl-L-methionine to the side chain nitrogens of arginine residues within proteins, to form methylated arginine derivatives and S-adenosyl-L-homocysteine. This protein can catalyze both, the formation of omega-N monomethylarginine and asymmetrical dimethylarginine, with a strong preference for the latter. It specifically mediates the asymmetric dimethylation of Arg2 of histone H3, and the methylated form represents a specific tag for epigenetic transcriptional repression. This protein also forms a complex with, and methylates DNA polymerase beta, resulting in stimulation of polymerase activity by enhancing DNA binding and processivity. | ENSG00000198890 | protein arginine methyltransferase 6 | NA |
| 57407 | NMRAL1 | This gene encodes an NADPH sensor protein that preferentially binds to NADPH. The encoded protein also negatively regulates the activity of NF-kappaB in a ubiquitylation-dependent manner. It plays a key role in cellular antiviral response by negatively regulating the interferon response factor 3-mediated expression of interferon beta. Alternative splicing of this gene results in multiple transcript variants. | ENSG00000153406 | NmrA-like family domain containing 1 | NA |
| 840 | CASP7 | This gene encodes a member of the cysteine-aspartic acid protease (caspase) family. Sequential activation of caspases plays a central role in the execution-phase of cell apoptosis. Caspases exist as inactive proenzymes which undergo proteolytic processing at conserved aspartic residues to produce two subunits, large and small, that dimerize to form the active enzyme. The precursor of the encoded protein is cleaved by caspase 3 and 10, is activated upon cell death stimuli and induces apoptosis. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000165806 | caspase 7 | NA |
| 55791 | LRIF1 | NA | ENSG00000121931 | ligand dependent nuclear receptor interacting factor 1 | NA |
| ENSG00000269749 | AC005614.5 | NA | ENSG00000269749 | NA | NA |
| 84191 | FAM96A | NA | ENSG00000166797 | family with sequence similarity 96 member A | NA |
| 3157 | HMGCS1 | NA | ENSG00000112972 | 3-hydroxy-3-methylglutaryl-CoA synthase 1 | NA |
| ENSG00000269534 | CTC-453G23.5 | NA | ENSG00000269534 | NA | NA |
| 79828 | METTL8 | NA | ENSG00000123600 | methyltransferase like 8 | NA |
| ENSG00000261438 | RP11-399O19.9 | NA | ENSG00000261438 | NA | NA |
| 79866 | BORA | BORA is an activator of the protein kinase Aurora A (AURKA; MIM 603072), which is required for centrosome maturation, spindle assembly, and asymmetric protein localization during mitosis (Hutterer et al., 2006 [PubMed 16890155]). | ENSG00000136122 | bora, aurora kinase A activator | NA |
| 129531 | MITD1 | Abscission, the separation of daughter cells at the end of cytokinesis, is effected by endosomal sorting complexes required for transport III (ESCRT-III). The protein encoded by this gene functions as a homodimer, with the N-termini binding to a subset of ESCRT-III subunits and the C-termini binding to membranes. The encoded protein regulates ESCRT-III activity and is required for proper cytokinesis. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000158411 | microtubule interacting and trafficking domain containing 1 | NA |
| 100506100 | LOC100506100 | NA | ENSG00000223478 | uncharacterized LOC100506100 | NA |
| 11235 | PDCD10 | This gene encodes an evolutionarily conserved protein associated with cell apoptosis. The protein interacts with the serine/threonine protein kinase MST4 to modulate the extracellular signal-regulated kinase (ERK) pathway. It also interacts with and is phosphorylated by serine/threonine kinase 25, and is thought to function in a signaling pathway essential for vascular development. Mutations in this gene are one cause of cerebral cavernous malformations, which are vascular malformations that cause seizures and cerebral hemorrhages. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000114209 | programmed cell death 10 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",14,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
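The query-then-write step is repeated for each cluster with only the row index changing, so it could be wrapped in a small helper. The sketch below is illustrative only: `annotate_cluster`, its `out_dir` argument, and the `query_fun` argument are hypothetical names that do not appear in the original analysis. `query_fun` defaults to `mygene::queryMany` and exists so the helper can be exercised without network access.

```r
# Hypothetical helper (not part of the original analysis): annotate the genes
# of one cluster and write the queried Ensembl IDs to a per-cluster file.
annotate_cluster <- function(gene_ids, cluster, out_dir,
                             query_fun = mygene::queryMany) {
  # Same scopes, fields, and species as the calls in this report.
  out <- query_fun(gene_ids, scopes = "ensembl.gene",
                   fields = c("name", "summary", "symbol"), species = "human")
  # File name follows the gene_names_clus_<k>.txt pattern used above.
  out_file <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(as.character(out$query), out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  # Return the annotations invisibly so the caller can pass them to kable().
  invisible(as.data.frame(out))
}
```

Under these assumptions, each cluster would be handled by a call such as `kable(annotate_cluster(gene_list[15, ], 15, "../utilities/GTEX2013_sparse_load_voom"))`.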
out <- mygene::queryMany(gene_list[15,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | symbol | X_id | summary | name | notfound |
|---|---|---|---|---|---|
| ENSG00000197249 | SERPINA1 | 5265 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | serpin family A member 1 | NA |
| ENSG00000254814 | RP11-535A19.1 | ENSG00000254814 | NA | NA | NA |
| ENSG00000225075 | RP11-426L16.3 | ENSG00000225075 | NA | NA | NA |
| ENSG00000161692 | DBF4B | 80174 | This gene encodes a regulator of the cell division cycle 7 homolog (S. cerevisiae) protein, a serine-threonine kinase which links cell cycle regulation to genome duplication. This protein localizes to the nucleus and, in complex with the cell division cycle 7 homolog (S. cerevisiae) protein, may facilitate M phase progression. Alternative splicing results in multiple transcript variants. | DBF4 zinc finger B | NA |
| ENSG00000240494 | RPS12P28 | ENSG00000240494 | NA | ribosomal protein S12 pseudogene 28 | NA |
| ENSG00000272146 | ARF4-AS1 | 106144532 | NA | ARF4 antisense RNA 1 | NA |
| ENSG00000184635 | ZNF93 | 81931 | NA | zinc finger protein 93 | NA |
| ENSG00000236618 | PITPNA-AS1 | 100306951 | NA | PITPNA antisense RNA 1 | NA |
| ENSG00000141570 | CBX8 | 57332 | NA | chromobox 8 | NA |
| ENSG00000138166 | DUSP5 | 1847 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, is expressed in a variety of tissues with the highest levels in pancreas and brain, and is localized in the nucleus. | dual specificity phosphatase 5 | NA |
| ENSG00000167920 | TMEM99 | 147184 | NA | transmembrane protein 99 | NA |
| ENSG00000103995 | CEP152 | 22995 | This gene encodes a protein that is thought to be involved with centrosome function. Mutations in this gene have been associated with primary microcephaly (MCPH4). Alternative splicing results in multiple transcript variants. | centrosomal protein 152 | NA |
| ENSG00000213988 | ZNF90 | 7643 | NA | zinc finger protein 90 | NA |
| ENSG00000248932 | LOC100507291 | 100507291 | NA | uncharacterized LOC100507291 | NA |
| ENSG00000253540 | FAM86HP | ENSG00000253540 | NA | family with sequence similarity 86 member H, pseudogene | NA |
| ENSG00000118162 | KPTN | 11133 | This gene encodes a filamentous-actin-associated protein, which is involved in actin dynamics and plays an important role in neuromorphogenesis. Mutations in this gene result in recessive mental retardation-41. Alternatively spliced transcript variants have been found for this gene. | kaptin (actin binding protein) | NA |
| ENSG00000177051 | FBXO46 | 23403 | Members of the F-box protein family, such as FBXO46, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | F-box protein 46 | NA |
| ENSG00000101405 | OXT | 5020 | This gene encodes a precursor protein that is processed to produce oxytocin and neurophysin I. Oxytocin is a posterior pituitary hormone which is synthesized as an inactive precursor in the hypothalamus along with its carrier protein neurophysin I. Together with neurophysin, it is packaged into neurosecretory vesicles and transported axonally to the nerve endings in the neurohypophysis, where it is either stored or secreted into the bloodstream. The precursor seems to be activated while it is being transported along the axon to the posterior pituitary. This hormone contracts smooth muscle during parturition and lactation. It is also involved in cognition, tolerance, adaptation and complex sexual and maternal behaviour, as well as in the regulation of water excretion and cardiovascular functions. | oxytocin/neurophysin I prepropeptide | NA |
| ENSG00000270673 | YTHDF3-AS1 | 101410533 | NA | YTHDF3 antisense RNA 1 (head to head) | NA |
| ENSG00000266783 | RP11-715F3.2 | ENSG00000266783 | NA | NA | NA |
| ENSG00000117586 | TNFSF4 | 7292 | This gene encodes a cytokine of the tumor necrosis factor (TNF) ligand family. The encoded protein functions in T cell antigen-presenting cell (APC) interactions and mediates adhesion of activated T cells to endothelial cells. Polymorphisms in this gene have been associated with Sjogren’s syndrome and systemic lupus erythematosus. Alternative splicing results in multiple transcript variants. | tumor necrosis factor superfamily member 4 | NA |
| ENSG00000272667 | RP11-395A13.2 | ENSG00000272667 | NA | NA | NA |
| ENSG00000167543 | TP53I13 | 90313 | NA | tumor protein p53 inducible protein 13 | NA |
| ENSG00000267030 | CTB-50L17.7 | ENSG00000267030 | NA | NA | NA |
| ENSG00000170430 | MGMT | 4255 | Alkylating agents are potent carcinogens that can result in cell death, mutation and cancer. The protein encoded by this gene is a DNA repair protein that is involved in cellular defense against mutagenesis and toxicity from alkylating agents. The protein catalyzes transfer of methyl groups from O(6)-alkylguanine and other methylated moieties of the DNA to its own molecule, which repairs the toxic lesions. Methylation of the gene’s promoter has been associated with several cancer types, including colorectal cancer, lung cancer, lymphoma and glioblastoma. | O-6-methylguanine-DNA methyltransferase | NA |
| ENSG00000196476 | C20orf96 | 140680 | NA | chromosome 20 open reading frame 96 | NA |
| ENSG00000241073 | RP4-714D9.2 | ENSG00000241073 | NA | NA | NA |
| ENSG00000166965 | RCCD1 | 91433 | NA | RCC1 domain containing 1 | NA |
| ENSG00000237015 | CTA-984G1.5 | ENSG00000237015 | NA | NA | NA |
| ENSG00000213443 | RP11-75L1.2 | ENSG00000213443 | NA | NA | NA |
| ENSG00000232677 | LINC00665 | 100506930 | NA | long intergenic non-protein coding RNA 665 | NA |
| ENSG00000259583 | RP11-66B24.4 | ENSG00000259583 | NA | NA | NA |
| ENSG00000062282 | DGAT2 | 84649 | This gene encodes one of two enzymes which catalyzes the final reaction in the synthesis of triglycerides in which diacylglycerol is covalently bound to long chain fatty acyl-CoAs. The encoded protein catalyzes this reaction at low concentrations of magnesium chloride while the other enzyme has high activity at high concentrations of magnesium chloride. Multiple transcript variants encoding different isoforms have been found for this gene. | diacylglycerol O-acyltransferase 2 | NA |
| ENSG00000126453 | BCL2L12 | 83596 | This gene encodes a member of a family of proteins containing a Bcl-2 homology domain 2 (BH2). The encoded protein is an anti-apoptotic factor that acts as an inhibitor of caspases 3 and 7 in the cytoplasm. In the nucleus, it binds to the p53 tumor suppressor protein, preventing its association with target genes. Overexpression of this gene has been detected in a number of different cancers. There is a pseudogene for this gene on chromosome 3. Alternative splicing results in multiple transcript variants. | BCL2 like 12 | NA |
| ENSG00000106268 | NUDT1 | 4521 | Misincorporation of oxidized nucleoside triphosphates into DNA/RNA during replication and transcription can cause mutations that may result in carcinogenesis or neurodegeneration. The protein encoded by this gene is an enzyme that hydrolyzes oxidized purine nucleoside triphosphates, such as 8-oxo-dGTP, 8-oxo-dATP, 2-hydroxy-dATP, and 2-hydroxy rATP, to monophosphates, thereby preventing misincorporation. The encoded protein is localized mainly in the cytoplasm, with some in the mitochondria, suggesting that it is involved in the sanitization of nucleotide pools both for nuclear and mitochondrial genomes. Several alternatively spliced transcript variants, some of which encode distinct isoforms, have been identified. Additional variants have been observed, but their full-length natures have not been determined. A single-nucleotide polymorphism that results in the production of an additional, longer isoform (p26) has been described. | nudix hydrolase 1 | NA |
| ENSG00000155363 | MOV10 | 4343 | NA | Mov10 RISC complex RNA helicase | NA |
| ENSG00000105750 | ZNF85 | 7639 | NA | zinc finger protein 85 | NA |
| ENSG00000110011 | DNAJC4 | 3338 | NA | DnaJ heat shock protein family (Hsp40) member C4 | NA |
| ENSG00000144031 | ANKRD53 | 79998 | NA | ankyrin repeat domain 53 | NA |
| ENSG00000260018 | RP11-505K9.1 | ENSG00000260018 | NA | NA | NA |
| ENSG00000261779 | RP11-69H7.3 | ENSG00000261779 | NA | NA | NA |
| ENSG00000079462 | PAFAH1B3 | 5050 | This gene encodes an acetylhydrolase that catalyzes the removal of an acetyl group from the glycerol backbone of platelet-activating factor. The encoded enzyme is a subunit of the platelet-activating factor acetylhydrolase isoform 1B complex, which consists of the catalytic beta and gamma subunits and the regulatory alpha subunit. This complex functions in brain development. A translocation between this gene on chromosome 19 and the CDC-like kinase 2 gene on chromosome 1 has been observed, and was associated with mental retardation, ataxia, and atrophy of the brain. Alternatively spliced transcript variants have been described. | platelet activating factor acetylhydrolase 1b catalytic subunit 3 | NA |
| ENSG00000146066 | HIGD2A | 192286 | NA | HIG1 hypoxia inducible domain family member 2A | NA |
| ENSG00000076248 | UNG | 7374 | This gene encodes one of several uracil-DNA glycosylases. One important function of uracil-DNA glycosylases is to prevent mutagenesis by eliminating uracil from DNA molecules by cleaving the N-glycosylic bond and initiating the base-excision repair (BER) pathway. Uracil bases occur from cytosine deamination or misincorporation of dUMP residues. Alternative promoter usage and splicing of this gene leads to two different isoforms: the mitochondrial UNG1 and the nuclear UNG2. The UNG2 term was used as a previous symbol for the CCNO gene (GeneID 10309), which has been confused with this gene, in the literature and some databases. | uracil DNA glycosylase | NA |
| ENSG00000175854 | SWI5 | 375757 | NA | SWI5 homologous recombination repair protein | NA |
| ENSG00000114735 | HEMK1 | 51409 | NA | HemK methyltransferase family member 1 | NA |
| ENSG00000260136 | CTD-2270L9.4 | ENSG00000260136 | NA | NA | NA |
| ENSG00000184465 | WDR27 | 253769 | This gene encodes a protein with multiple WD repeats. Proteins with these repeats may form scaffolds for protein-protein interaction and play key roles in cell signalling. Alternative splicing results in multiple transcript variants, but the full-length structure of some of these variants cannot be determined. | WD repeat domain 27 | NA |
| ENSG00000260517 | RP11-426C22.5 | ENSG00000260517 | NA | NA | NA |
| ENSG00000221909 | FAM200A | 221786 | This gene encodes a protein of unknown function. The protein is weakly similar to transposase-like proteins in human and mouse. | family with sequence similarity 200 member A | NA |
| ENSG00000236778 | INTS6-AS1 | ENSG00000236778 | NA | INTS6 antisense RNA 1 | NA |
| ENSG00000166896 | ATP23 | 91419 | The protein encoded by this gene is amplified in glioblastomas and interacts with the DNA binding subunit of DNA-dependent protein kinase. This kinase is involved in double-strand break repair (DSB), and higher expression of the encoded protein increases the efficiency of DSB. In addition, comparison to orthologous proteins strongly suggests that this protein is a metalloprotease important in the biosynthesis of mitochondrial ATPase. Several transcript variants encoding different isoforms have been found for this gene. | ATP23 metallopeptidase and ATP synthase assembly factor homolog (S. cerevisiae) | NA |
| ENSG00000236015 | AC011290.5 | ENSG00000236015 | NA | NA | NA |
| ENSG00000104983 | CCDC61 | 729440 | NA | coiled-coil domain containing 61 | NA |
| ENSG00000160318 | CLDND2 | 125875 | NA | claudin domain containing 2 | NA |
| ENSG00000224420 | ADM5 | 199800 | NA | adrenomedullin 5 (putative) | NA |
| ENSG00000267105 | CTD-2369P2.4 | ENSG00000267105 | NA | NA | NA |
| ENSG00000253210 | RP11-809O17.1 | ENSG00000253210 | NA | NA | NA |
| ENSG00000186665 | C17orf58 | 284018 | NA | chromosome 17 open reading frame 58 | NA |
| ENSG00000108107 | RPL28 | 6158 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L28E family of ribosomal proteins. It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | ribosomal protein L28 | NA |
| ENSG00000224261 | RPSAP18 | ENSG00000224261 | NA | ribosomal protein SA pseudogene 18 | NA |
| ENSG00000137404 | NRM | 11270 | The protein encoded by this gene contains transmembrane domains and resides within the inner nuclear membrane, where it is tightly associated with the nucleus. This protein shares homology with isoprenylcysteine carboxymethyltransferase enzymes. Alternative splicing results in multiple transcript variants that encode different protein isoforms. | nurim (nuclear envelope membrane protein) | NA |
| ENSG00000221829 | FANCG | 2189 | The Fanconi anemia complementation group (FANC) currently includes FANCA, FANCB, FANCC, FANCD1 (also called BRCA2), FANCD2, FANCE, FANCF, FANCG, FANCI, FANCJ (also called BRIP1), FANCL, FANCM and FANCN (also called PALB2). The previously defined group FANCH is the same as FANCA. Fanconi anemia is a genetically heterogeneous recessive disorder characterized by cytogenetic instability, hypersensitivity to DNA crosslinking agents, increased chromosomal breakage, and defective DNA repair. The members of the Fanconi anemia complementation group do not share sequence similarity; they are related by their assembly into a common nuclear protein complex. This gene encodes the protein for complementation group G. | Fanconi anemia complementation group G | NA |
| ENSG00000269560 | CTD-2192J16.21 | ENSG00000269560 | NA | NA | NA |
| ENSG00000152147 | GEMIN6 | 79833 | GEMIN6 is part of a large macromolecular complex, localized to both the cytoplasm and the nucleus, that plays a role in the cytoplasmic assembly of small nuclear ribonucleoproteins (snRNPs). Other members of this complex include SMN (MIM 600354), GEMIN2 (SIP1; MIM 602595), GEMIN3 (DDX20; MIM 606168), GEMIN4 (MIM 606969), and GEMIN5 (MIM 607005). | gem nuclear organelle associated protein 6 | NA |
| ENSG00000243414 | TICAM2 | 353376 | TIRP is a Toll/interleukin-1 receptor (IL1R; MIM 147810) (TIR) domain-containing adaptor protein involved in Toll receptor signaling (see TLR4; MIM 603030). | toll like receptor adaptor molecule 2 | NA |
| ENSG00000232995 | RGS5 | 8490 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | regulator of G-protein signaling 5 | NA |
| ENSG00000224066 | RP4-622L5.7 | ENSG00000224066 | NA | NA | NA |
| ENSG00000268516 | CTD-3138B18.5 | ENSG00000268516 | NA | NA | NA |
| ENSG00000187187 | ZNF546 | 339327 | NA | zinc finger protein 546 | NA |
| ENSG00000229539 | RP11-119B16.2 | ENSG00000229539 | NA | NA | NA |
| ENSG00000233469 | ST6GALNAC4P1 | ENSG00000233469 | NA | ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 4 pseudogene 1 | NA |
| ENSG00000273116 | NA | NA | NA | NA | TRUE |
| ENSG00000148814 | LRRC27 | 80313 | NA | leucine rich repeat containing 27 | NA |
| ENSG00000205208 | C4orf46 | 201725 | This gene encodes a small, conserved protein of unknown function that is expressed in a variety of tissues. There are pseudogenes for this gene on chromosomes 6, 8, 16, and X. Alternative splicing results in multiple transcript variants. | chromosome 4 open reading frame 46 | NA |
| ENSG00000205464 | ATP6AP1L | 92270 | NA | ATPase H+ transporting accessory protein 1 like | NA |
| ENSG00000267575 | LOC101927151 | 101927151 | NA | uncharacterized LOC101927151 | NA |
| ENSG00000224956 | NA | NA | NA | NA | TRUE |
| ENSG00000197568 | HHLA3 | 11147 | NA | HERV-H LTR-associating 3 | NA |
| ENSG00000267751 | AC009005.2 | ENSG00000267751 | NA | NA | NA |
| ENSG00000083814 | ZNF671 | 79891 | NA | zinc finger protein 671 | NA |
| ENSG00000188243 | COMMD6 | 170622 | COMMD6 belongs to a family of NF-kappa-B (see RELA; MIM 164014)-inhibiting proteins characterized by the presence of a COMM domain (see COMMD1; MIM 607238) (de Bie et al., 2006 [PubMed 16573520]). | COMM domain containing 6 | NA |
| ENSG00000204519 | ZNF551 | 90233 | NA | zinc finger protein 551 | NA |
| ENSG00000164241 | C5orf63 | 401207 | NA | chromosome 5 open reading frame 63 | NA |
| ENSG00000245614 | DDX11-AS1 | 100506660 | NA | DDX11 antisense RNA 1 | NA |
| ENSG00000134253 | TRIM45 | 80263 | This gene encodes a member of the tripartite motif family. The encoded protein may function as a transcriptional repressor of the mitogen-activated protein kinase pathway. Alternatively spliced transcript variants have been described. | tripartite motif containing 45 | NA |
| ENSG00000188878 | FBF1 | ENSG00000188878 | NA | Fas (TNFRSF6) binding factor 1 | NA |
| ENSG00000138399 | FASTKD1 | 79675 | NA | FAST kinase domains 1 | NA |
| ENSG00000204410 | MSH5 | 4439 | This gene encodes a member of the mutS family of proteins that are involved in DNA mismatch repair and meiotic recombination. This protein is similar to a Saccharomyces cerevisiae protein that participates in segregation fidelity and crossing-over events during meiosis. This protein plays a role in promoting ionizing radiation-induced apoptosis. This protein forms hetero-oligomers with another member of this family, mutS homolog 4. Polymorphisms in this gene have been linked to various human diseases, including IgA deficiency, common variable immunodeficiency, and premature ovarian failure. Alternative splicing results in multiple transcript variants. Read-through transcription also exists between this gene and the downstream chromosome 6 open reading frame 26 (C6orf26) gene. | mutS homolog 5 | NA |
| ENSG00000245261 | RP3-330M21.5 | ENSG00000245261 | NA | NA | NA |
| ENSG00000171163 | ZNF692 | 55657 | NA | zinc finger protein 692 | NA |
| ENSG00000169964 | TMEM42 | 131616 | NA | transmembrane protein 42 | NA |
| ENSG00000131378 | RFTN1 | 23180 | NA | raftlin, lipid raft linker 1 | NA |
| ENSG00000106477 | CEP41 | 95681 | This gene encodes a centrosomal and microtubule-binding protein which is predicted to have two coiled-coil domains and a rhodanese domain. In human retinal pigment epithelial cells the protein localized to centrioles and cilia. Mutations in this gene have been associated with Joubert Syndrome 15; an autosomal recessive ciliopathy and neurological disorder. Alternative splicing results in multiple transcript variants. | centrosomal protein 41 | NA |
| ENSG00000156787 | TBC1D31 | 93594 | NA | TBC1 domain family member 31 | NA |
| ENSG00000259901 | NA | NA | NA | NA | TRUE |
| ENSG00000233184 | RP11-421L21.3 | ENSG00000233184 | NA | NA | NA |
| ENSG00000105173 | CCNE1 | 898 | The protein encoded by this gene belongs to the highly conserved cyclin family, whose members are characterized by a dramatic periodicity in protein abundance through the cell cycle. Cyclins function as regulators of CDK kinases. Different cyclins exhibit distinct expression and degradation patterns which contribute to the temporal coordination of each mitotic event. This cyclin forms a complex with and functions as a regulatory subunit of CDK2, whose activity is required for cell cycle G1/S transition. This protein accumulates at the G1-S phase boundary and is degraded as cells progress through S phase. Overexpression of this gene has been observed in many tumors, which results in chromosome instability, and thus may contribute to tumorigenesis. This protein was found to associate with, and be involved in, the phosphorylation of NPAT protein (nuclear protein mapped to the ATM locus), which participates in cell-cycle regulated histone gene expression and plays a critical role in promoting cell-cycle progression in the absence of pRB. | cyclin E1 | NA |
| ENSG00000260368 | RP11-521I2.3 | ENSG00000260368 | NA | NA | NA |
| ENSG00000167081 | PBX3 | 5090 | NA | PBX homeobox 3 | NA |
# Save the Ensembl IDs queried for cluster 15, one ID per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 15, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
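# Annotate the top genes in row 16 of gene_list (Ensembl gene IDs) with name, summary, and symbol.
out <- mygene::queryMany(gene_list[16, ], scopes = "ensembl.gene",
                         fields = c("name", "summary", "symbol"), species = "human")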
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | query | name | X_id | symbol | notfound |
|---|---|---|---|---|---|
| Proteins encoded by the complexin/synaphin gene family are cytosolic proteins that function in synaptic vesicle exocytosis. These proteins bind syntaxin, part of the SNAP receptor. The protein product of this gene binds to the SNAP receptor complex and disrupts it, allowing transmitter release. | ENSG00000168993 | complexin 1 | 10815 | CPLX1 | NA |
| The protein encoded by this gene is involved in the attachment of osteoclasts to the mineralized bone matrix. The encoded protein is secreted and binds hydroxyapatite with high affinity. The osteoclast vitronectin receptor is found in the cell membrane and may be involved in the binding to this protein. This protein is also a cytokine that upregulates expression of interferon-gamma and interleukin-12. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000118785 | secreted phosphoprotein 1 | 6696 | SPP1 | NA |
| This gene encodes a member of the paralemmin protein family. The product of this gene is a prenylated and palmitoylated phosphoprotein that associates with the cytoplasmic face of plasma membranes and is implicated in plasma membrane dynamics in neurons and other cell types. Several alternatively spliced transcript variants have been identified, but the full-length nature of only two transcript variants has been determined. | ENSG00000099864 | paralemmin | 5064 | PALM | NA |
| The protein encoded by this gene may play a role in the attachment of stem cells to the bone marrow extracellular matrix or to stromal cells. This single-pass membrane protein is highly glycosylated and phosphorylated by protein kinase C. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000174059 | CD34 molecule | 947 | CD34 | NA |
| This gene belongs to the family of reticulon encoding genes. Reticulons are associated with the endoplasmic reticulum, and are involved in neuroendocrine secretion or in membrane trafficking in neuroendocrine cells. This gene is considered to be a specific marker for neurological diseases and cancer, and is a potential molecular target for therapy. Alternative splicing results in multiple transcript variants. | ENSG00000139970 | reticulon 1 | 6252 | RTN1 | NA |
| The protein encoded by this gene is similar to insulin in function and structure and is a member of a family of proteins involved in mediating growth and development. The encoded protein is processed from a precursor, bound by a specific receptor, and secreted. Defects in this gene are a cause of insulin-like growth factor I deficiency. Alternative splicing results in multiple transcript variants encoding different isoforms that may undergo similar processing to generate mature protein. | ENSG00000017427 | insulin like growth factor 1 | 3479 | IGF1 | NA |
| This gene encodes a protein with an arfaptin homology domain that is found both in the cytosol and as membrane-bound form on the Golgi complex and immature secretory granules. This protein is believed to be an autoantigen in insulin-dependent diabetes mellitus and primary Sjogren’s syndrome. Several transcript variants encoding two different isoforms have been found for this gene. | ENSG00000003147 | islet cell autoantigen 1 | 3382 | ICA1 | NA |
| cAMP is a signaling molecule important for a variety of cellular functions. cAMP exerts its effects by activating the cAMP-dependent protein kinase, which transduces the signal through phosphorylation of different target proteins. The inactive kinase holoenzyme is a tetramer composed of two regulatory and two catalytic subunits. cAMP causes the dissociation of the inactive holoenzyme into a dimer of regulatory subunits bound to four cAMP and two free monomeric catalytic subunits. Four different regulatory subunits and three catalytic subunits have been identified in humans. The protein encoded by this gene is one of the regulatory subunits. This subunit can be phosphorylated by the activated catalytic subunit. This subunit has been shown to interact with and suppress the transcriptional activity of the cAMP responsive element binding protein 1 (CREB1) in activated T cells. Knockout studies in mice suggest that this subunit may play an important role in regulating energy balance and adiposity. The studies also suggest that this subunit may mediate the gene induction and cataleptic behavior induced by haloperidol. | ENSG00000005249 | protein kinase cAMP-dependent type II regulatory subunit beta | 5577 | PRKAR2B | NA |
| This gene encodes a multi-domain secreted protein that may have a critical role in ocular and limb development. Mutations in this gene are associated with microphthalmia and limb anomalies. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000198732 | SPARC related modular calcium binding 1 | 64093 | SMOC1 | NA |
| The protein encoded by this gene is a member of the RAMP family of single-transmembrane-domain proteins, called receptor (calcitonin) activity modifying proteins (RAMPs). RAMPs are type I transmembrane proteins with an extracellular N terminus and a cytoplasmic C terminus. RAMPs are required to transport calcitonin-receptor-like receptor (CRLR) to the plasma membrane. CRLR, a receptor with seven transmembrane domains, can function as either a calcitonin-gene-related peptide (CGRP) receptor or an adrenomedullin receptor, depending on which members of the RAMP family are expressed. In the presence of this (RAMP2) protein, CRLR functions as an adrenomedullin receptor. The RAMP2 protein is involved in core glycosylation and transportation of adrenomedullin receptor to the cell surface. | ENSG00000131477 | receptor activity modifying protein 2 | 10266 | RAMP2 | NA |
| NA | ENSG00000271738 | NA | NA | NA | TRUE |
| NA | ENSG00000130300 | plasmalemma vesicle associated protein | 83483 | PLVAP | NA |
| NA | ENSG00000186994 | KN motif and ankyrin repeat domains 3 | 256949 | KANK3 | NA |
| The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which is associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates SAPK/JNK and p38, is expressed predominantly in the adult brain, heart, and skeletal muscle, is localized in the cytoplasm, and is induced by nerve growth factor and insulin. An intronless pseudogene for DUSP8 is present on chromosome 10q11.2. | ENSG00000184545 | dual specificity phosphatase 8 | 1850 | DUSP8 | NA |
| NA | ENSG00000197291 | RAMP2 antisense RNA 1 | 100190938 | RAMP2-AS1 | NA |
| NA | ENSG00000122378 | family with sequence similarity 213 member A | 84293 | FAM213A | NA |
| This gene encodes cytosolic alanine aminotransaminase 1 (ALT1); also known as glutamate-pyruvate transaminase 1. This enzyme catalyzes the reversible transamination between alanine and 2-oxoglutarate to generate pyruvate and glutamate and, therefore, plays a key role in the intermediary metabolism of glucose and amino acids. Serum activity levels of this enzyme are routinely used as a biomarker of liver injury caused by drug toxicity, infection, alcohol, and steatosis. A related gene on chromosome 16 encodes a putative mitochondrial alanine aminotransaminase. | ENSG00000167701 | glutamic-pyruvate transaminase (alanine aminotransferase) | 2875 | GPT | NA |
| NA | ENSG00000226009 | KCNIP2 antisense RNA 1 | ENSG00000226009 | KCNIP2-AS1 | NA |
| NA | ENSG00000135447 | protein phosphatase 1 regulatory inhibitor subunit 1A | 5502 | PPP1R1A | NA |
| The protein encoded by this gene is a Golgi stack membrane protein that is involved in the creation of a precursor of the H antigen, which is required for the final step in the soluble A and B antigen synthesis pathway. This gene is one of two encoding the galactoside 2-L-fucosyltransferase enzyme. Mutations in this gene are a cause of the H-Bombay blood group. | ENSG00000174951 | fucosyltransferase 1 (H blood group) | 2523 | FUT1 | NA |
| NA | ENSG00000260912 | NA | ENSG00000260912 | RP11-363E7.4 | NA |
| The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | ENSG00000166819 | perilipin 1 | 5346 | PLIN1 | NA |
| This gene encodes a protein belonging to the member of elastin microfibril interface-located (EMILIN) protein family. This family member is an extracellular matrix glycoprotein that can interfere with tumor angiogenesis and growth. It serves as a transforming growth factor beta antagonist and can interfere with the VEGF-A/VEGFR2 pathway. A related pseudogene has been identified on chromosome 6. | ENSG00000173269 | multimerin 2 | 79812 | MMRN2 | NA |
| This gene encodes a member of the NOTCH family of proteins. Members of this Type I transmembrane protein family share structural characteristics including an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. Notch signaling is an evolutionarily conserved intercellular signaling pathway that regulates interactions between physically adjacent cells through binding of Notch family receptors to their cognate ligands. The encoded preproprotein is proteolytically processed in the trans-Golgi network to generate two polypeptide chains that heterodimerize to form the mature cell-surface receptor. This receptor may play a role in vascular, renal and hepatic development. Mutations in this gene may be associated with schizophrenia. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000204301 | notch 4 | 4855 | NOTCH4 | NA |
| This gene encodes a member of the SLC29A/ENT transporter protein family. The encoded membrane protein catalyzes the reuptake of monoamines into presynaptic neurons, thus determining the intensity and duration of monoamine neural signaling. It has been shown to transport several compounds, including serotonin, dopamine, and the neurotoxin 1-methyl-4-phenylpyridinium. Alternative splicing results in multiple transcript variants. | ENSG00000164638 | solute carrier family 29 member 4 | 222962 | SLC29A4 | NA |
| This gene encodes a type I membrane glycoprotein containing two extracellular immunoglobulin domains, a transmembrane and a cytoplasmic domain. This gene is expressed by various cell types, including B cells, a subset of T cells, thymocytes, endothelial cells, and neurons. The encoded protein plays an important role in immunosuppression and regulation of anti-tumor activity. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000091972 | CD200 molecule | 4345 | CD200 | NA |
| This gene encodes a member of the FXYD family of transmembrane proteins. This particular protein encodes phosphohippolin, which likely affects the activity of Na,K-ATPase. Multiple alternatively spliced transcript variants encoding the same protein have been described. Related pseudogenes have been identified on chromosomes 10 and X. Read-through transcripts have been observed between this locus and the downstream sodium/potassium-transporting ATPase subunit gamma (FXYD2, GeneID 486) locus. | ENSG00000137726 | FXYD domain containing ion transport regulator 6 | 53826 | FXYD6 | NA |
| NA | ENSG00000254528 | NA | ENSG00000254528 | RP11-728F11.4 | NA |
| NA | ENSG00000182118 | family with sequence similarity 89 member A | 375061 | FAM89A | NA |
| The beta-adrenergic receptor kinase specifically phosphorylates the agonist-occupied form of the beta-adrenergic and related G protein-coupled receptors. Overall, the beta adrenergic receptor kinase 2 has 85% amino acid similarity with beta adrenergic receptor kinase 1, with the protein kinase catalytic domain having 95% similarity. These data suggest the existence of a family of receptor kinases which may serve broadly to regulate receptor function. | ENSG00000100077 | G protein-coupled receptor kinase 3 | 157 | GRK3 | NA |
| NA | ENSG00000222328 | RNA, U2 small nuclear 2, pseudogene | ENSG00000222328 | RNU2-2P | NA |
| This gene encodes a member of the family of voltage-gated potassium (Kv) channel-interacting proteins (KCNIPs), which belongs to the recoverin branch of the EF-hand superfamily. Members of the KCNIP family are small calcium binding proteins. They all have EF-hand-like domains, and differ from each other in the N-terminus. They are integral subunit components of native Kv4 channel complexes. They may regulate A-type currents, and hence neuronal excitability, in response to changes in intracellular calcium. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified from this gene. | ENSG00000120049 | potassium voltage-gated channel interacting protein 2 | 30819 | KCNIP2 | NA |
| The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 2 subunit. Mutations in this gene result in familial basilar or hemiplegic migraines, and in a rare syndrome known as alternating hemiplegia of childhood. | ENSG00000018625 | ATPase Na+/K+ transporting subunit alpha 2 | 477 | ATP1A2 | NA |
| The Notch signaling pathway is an intercellular signaling mechanism that is essential for proper embryonic development. Members of the Notch gene family encode transmembrane receptors that are critical for various cell fate decisions. The protein encoded by this gene is one of several ligands that activate Notch and related receptors. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000184916 | jagged 2 | 3714 | JAG2 | NA |
| The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, ERK2 and JNK, is expressed in a variety of tissues, and is localized in the nucleus. Two alternatively spliced transcript variants, encoding distinct isoforms, have been observed for this gene. In addition, multiple polyadenylation sites have been reported. | ENSG00000120875 | dual specificity phosphatase 4 | 1846 | DUSP4 | NA |
| NA | ENSG00000164849 | G protein-coupled receptor 146 | 115330 | GPR146 | NA |
| NA | ENSG00000177685 | calcium release activated channel regulator 2B | 283229 | CRACR2B | NA |
| NA | ENSG00000257607 | NA | ENSG00000257607 | RP11-449P15.1 | NA |
| This gene encodes a protein which contains a C-terminal domain able to interact with the angiotensin II (AT2) receptor and a large coiled-coil region allowing dimerization. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. One of the transcript variants has been shown to encode a mitochondrial protein that acts as a tumor suppressor and participates in AT2 signaling pathways. Other variants may encode nuclear or transmembrane proteins but it has not been determined whether they also participate in AT2 signaling pathways. | ENSG00000129422 | microtubule associated tumor suppressor 1 | 57509 | MTUS1 | NA |
| The gene is part of a 3-member transmembrane receptor kinase receptor family with a processed pseudogene distal on chromosome 15. The encoded protein is activated by the products of the growth arrest-specific gene 6 and protein S genes and is involved in controlling cell survival and proliferation, spermatogenesis, immunoregulation and phagocytosis. The encoded protein has also been identified as a cell entry factor for Ebola and Marburg viruses. | ENSG00000092445 | TYRO3 protein tyrosine kinase | 7301 | TYRO3 | NA |
| Members of the perilipin family, such as PLIN4, coat intracellular lipid storage droplets (Wolins et al., 2003 [PubMed 12840023]). | ENSG00000167676 | perilipin 4 | 729359 | PLIN4 | NA |
| NA | ENSG00000258603 | NA | ENSG00000258603 | RP3-414A15.10 | NA |
| NA | ENSG00000239911 | PRKAG2 antisense RNA 1 | ENSG00000239911 | PRKAG2-AS1 | NA |
| This gene is a member of the calcium/calmodulin-dependent protein kinase 1 family, a subfamily of the serine/threonine kinases. The encoded protein is a component of the calcium-regulated calmodulin-dependent protein kinase cascade. It has been associated with multiple processes including regulation of granulocyte function, activation of CREB-dependent gene transcription, aldosterone synthesis, differentiation and activation of neutrophil cells, and apoptosis of erythroleukemia cells. Alternatively spliced transcript variants encoding different isoforms of this gene have been described. | ENSG00000183049 | calcium/calmodulin dependent protein kinase ID | 57118 | CAMK1D | NA |
| NA | ENSG00000163053 | solute carrier family 16 member 14 | 151473 | SLC16A14 | NA |
| The protein encoded by this gene is a member of the L1 gene family of neural cell adhesion molecules. It is a neural recognition molecule that may be involved in signal transduction pathways. The deletion of one copy of this gene may be responsible for mental defects in patients with 3p- syndrome. This protein may also play a role in the growth of certain cancers. Alternate splicing results in both coding and non-coding variants. | ENSG00000134121 | cell adhesion molecule L1 like | 10752 | CHL1 | NA |
| NA | ENSG00000267992 | NA | ENSG00000267992 | CTB-189B5.3 | NA |
| NIPSNAP3B belongs to a family of proteins with putative roles in vesicular trafficking (Buechler et al., 2004 [PubMed 15177564]). | ENSG00000165028 | nipsnap homolog 3B | 55335 | NIPSNAP3B | NA |
| Due to its chemical instability and low solubility in aqueous solution, vitamin A requires cellular retinol-binding proteins (CRBPs), such as RBP7, for stability, internalization, intercellular transfer, homeostasis, and metabolism. | ENSG00000162444 | retinol binding protein 7 | 116362 | RBP7 | NA |
| The protein encoded by this gene is an adenosine receptor that belongs to the G-protein coupled receptor 1 family. There are 3 types of adenosine receptors, each with a specific pattern of ligand binding and tissue distribution, and together they regulate a diverse set of physiologic functions. The type A1 receptors inhibit adenylyl cyclase, and play a role in the fertilization process. Animal studies also suggest a role for A1 receptors in kidney function and ethanol intoxication. Transcript variants with alternative splicing in the 5’ UTR have been found for this gene. | ENSG00000163485 | adenosine A1 receptor | 134 | ADORA1 | NA |
| The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | ENSG00000130203 | apolipoprotein E | 348 | APOE | NA |
| The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | ENSG00000079435 | lipase E, hormone sensitive type | 3991 | LIPE | NA |
| NA | ENSG00000203685 | stum, mechanosensory transduction mediator homolog | 375057 | STUM | NA |
| The protein encoded by this gene is a member of a G protein subfamily that mediates signal transduction in pertussis toxin-insensitive systems. This encoded protein may play a role in maintaining the ionic balance of perilymphatic and endolymphatic cochlear fluids. | ENSG00000128266 | G protein subunit alpha z | 2781 | GNAZ | NA |
| Members of the F-box protein family, such as FBXO27, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | ENSG00000161243 | F-box protein 27 | 126433 | FBXO27 | NA |
| NA | ENSG00000229299 | NA | ENSG00000229299 | RP4-583P15.10 | NA |
| This gene encodes an adaptor protein and member of a cytoplasmic protein family involved in cell migration. The encoded protein contains a putative Src homology 2 (SH2) domain and guanine nucleotide exchange factor-like domain which allows this signaling protein to form a complex with scaffolding protein Crk-associated substrate. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000095370 | SH2 domain containing 3C | 10044 | SH2D3C | NA |
| The protein encoded by this gene belongs to the integrin alpha chain family. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain. They mediate a wide spectrum of cell-cell and cell-matrix interactions, and thus play a role in cell migration, morphologic development, differentiation, and metastasis. This protein functions as a receptor for the basement membrane protein laminin-1. It is mainly expressed in skeletal and cardiac muscles and may be involved in differentiation and migration processes during myogenesis. Defects in this gene are associated with congenital myopathy. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | ENSG00000135424 | integrin subunit alpha 7 | 3679 | ITGA7 | NA |
| Synaptic vesicle membrane docking and fusion is mediated by SNAREs (soluble N-ethylmaleimide-sensitive factor attachment protein receptors) located on the vesicle membrane (v-SNAREs) and the target membrane (t-SNAREs). The assembled v-SNARE/t-SNARE complex consists of a bundle of four helices, one of which is supplied by v-SNARE and the other three by t-SNARE. For t-SNAREs on the plasma membrane, the protein syntaxin supplies one helix and the protein encoded by this gene contributes the other two. Therefore, this gene product is a presynaptic plasma membrane protein involved in the regulation of neurotransmitter release. Two alternative transcript variants encoding different protein isoforms have been described for this gene. | ENSG00000132639 | synaptosome associated protein 25 | 6616 | SNAP25 | NA |
| This gene encodes a member of the FAM69 family of cysteine-rich type II transmembrane proteins. These proteins localize to the endoplasmic reticulum but their specific functions are unknown. | ENSG00000165716 | family with sequence similarity 69 member B | 138311 | FAM69B | NA |
| NA | ENSG00000239218 | ribosomal protein S20 pseudogene 22 | ENSG00000239218 | RPS20P22 | NA |
| The protein encoded by this gene is a member of the intercellular adhesion molecule (ICAM) family. All ICAM proteins are type I transmembrane glycoproteins, contain 2-9 immunoglobulin-like C2-type domains, and bind to the leukocyte adhesion LFA-1 protein. This protein may play a role in lymphocyte recirculation by blocking LFA-1-dependent cell adhesion. It mediates adhesive interactions important for antigen-specific immune response, NK-cell mediated clearance, lymphocyte recirculation, and other cellular interactions important for immune response and surveillance. Several transcript variants encoding the same protein have been found for this gene. | ENSG00000108622 | intercellular adhesion molecule 2 | 3384 | ICAM2 | NA |
| NA | ENSG00000214578 | high mobility group nucleosomal binding domain 2 pseudogene 15 | ENSG00000214578 | HMGN2P15 | NA |
| NA | ENSG00000205959 | NA | ENSG00000205959 | RP11-689P11.2 | NA |
| This gene encodes a serine/threonine protein kinase. Although this gene product is similar to serum- and glucocorticoid-induced protein kinase (SGK), this gene is not induced by serum or glucocorticoids. This gene is induced in response to signals that activate phosphatidylinositol 3-kinase, which is also true for SGK. Alternative splicing results in multiple transcript variants. | ENSG00000101049 | SGK2, serine/threonine kinase 2 | 10110 | SGK2 | NA |
| The sphingolipid metabolite sphingosine-1-phosphate promotes cell proliferation and survival, whereas its precursor, sphingosine, has the opposite effect. The ceramidase ACER2 hydrolyzes very long chain ceramides to generate sphingosine (Xu et al., 2006 [PubMed 16940153]). | ENSG00000177076 | alkaline ceramidase 2 | 340485 | ACER2 | NA |
| NA | ENSG00000176485 | phospholipase A2 group XVI | 11145 | PLA2G16 | NA |
| NA | ENSG00000256604 | NA | NA | NA | TRUE |
| NA | ENSG00000272678 | NA | ENSG00000272678 | RP11-797D24.4 | NA |
| NA | ENSG00000257622 | NA | ENSG00000257622 | RP11-44N21.4 | NA |
| FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs' roles include fatty acid uptake, transport, and metabolism. | ENSG00000170323 | fatty acid binding protein 4 | 2167 | FABP4 | NA |
| This gene encodes a protein belonging to the GTP-binding superfamily and to the immuno-associated nucleotide (IAN) subfamily of nucleotide-binding proteins. In humans, the IAN subfamily genes are located in a cluster at 7q36.1. This gene encodes an antiapoptotic protein that functions in T-cell survival. Polymorphisms in this gene are associated with systemic lupus erythematosus. Read-through transcription exists between this gene and the neighboring upstream GIMAP1 (GTPase, IMAP family member 1) gene. | ENSG00000196329 | GTPase, IMAP family member 5 | 55340 | GIMAP5 | NA |
| This gene likely encodes a member of the carboxypeptidase family of proteins. Cloning of a comparable locus in mouse indicates that the encoded protein contains a discoidin domain and a carboxypeptidase domain, but the protein appears to lack residues necessary for carboxypeptidase activity. | ENSG00000088882 | carboxypeptidase X (M14 family), member 1 | 56265 | CPXM1 | NA |
| NA | ENSG00000256661 | A2ML1 antisense RNA 1 | ENSG00000256661 | A2ML1-AS1 | NA |
| Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | ENSG00000076555 | acetyl-CoA carboxylase beta | 32 | ACACB | NA |
| This gene encodes a nuclear protein belonging to the hairy and enhancer of split-related (HESR) family of basic helix-loop-helix (bHLH)-type transcriptional repressors. Expression of this gene is induced by the Notch and c-Jun signal transduction pathways. Two similar and redundant genes in mouse are required for embryonic cardiovascular development, and are also implicated in neurogenesis and somitogenesis. Alternative splicing results in multiple transcript variants. | ENSG00000164683 | hes related family bHLH transcription factor with YRPW motif 1 | 23462 | HEY1 | NA |
| NA | ENSG00000225792 | NA | ENSG00000225792 | AC004540.4 | NA |
| NA | ENSG00000139597 | NEDD4 binding protein 2-like 1 | 90634 | N4BP2L1 | NA |
| NA | ENSG00000268358 | NA | NA | NA | TRUE |
| NA | ENSG00000256633 | NA | ENSG00000256633 | RP11-169D4.2 | NA |
| This gene encodes a protein containing several protein-protein interaction domains, including ankyrin-like repeats, a coiled-coil domain, and an ATP/GTP-binding motif. The encoded protein interacts with alpha-synuclein in neuronal tissue and may play a role in the formation of cytoplasmic inclusions and neurodegeneration. A mutation in this gene has been associated with Parkinson’s disease. Alternative splicing results in multiple transcript variants. | ENSG00000064692 | synuclein alpha interacting protein | 9627 | SNCAIP | NA |
| NA | ENSG00000105808 | uncharacterized LOC102724229 | 102724229 | LOC102724229 | NA |
| This gene encodes a member of the GAP1 family of GTPase-activating proteins that suppresses the Ras/mitogen-activated protein kinase pathway in response to Ca(2+). Stimuli that increase intracellular Ca(2+) levels result in the translocation of this protein to the plasma membrane, where it activates Ras GTPase activity. Consequently, Ras is converted from the active GTP-bound state to the inactive GDP-bound state and no longer activates downstream pathways that regulate gene expression, cell growth, and differentiation. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000105808 | RAS p21 protein activator 4 | 10156 | RASA4 | NA |
| The protein encoded by this gene is a member of the G-protein coupled receptor family 2. This protein is a receptor for parathyroid hormone (PTH) and for parathyroid hormone-like hormone (PTHLH). The activity of this receptor is mediated by G proteins which activate adenylyl cyclase and also a phosphatidylinositol-calcium second messenger system. Defects in this receptor are known to be the cause of Jansen’s metaphyseal chondrodysplasia (JMC), chondrodysplasia Blomstrand type (BOCD), as well as enchondromatosis. Two transcript variants encoding the same protein have been found for this gene. | ENSG00000160801 | parathyroid hormone 1 receptor | 5745 | PTH1R | NA |
| The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. The full transporter encoded by this gene may be involved in development of resistance to xenobiotics and engulfment during programmed cell death. | ENSG00000167972 | ATP binding cassette subfamily A member 3 | 21 | ABCA3 | NA |
| In the mouse, Nkd is a Dishevelled (see DVL1; MIM 601365)-binding protein that functions as a negative regulator of the Wnt (see WNT1; MIM 164820)-beta-catenin (see MIM 116806)-Tcf (see MIM 602272) signaling pathway. | ENSG00000140807 | naked cuticle homolog 1 | 85407 | NKD1 | NA |
| NA | ENSG00000237248 | long intergenic non-protein coding RNA 987 | 100499405 | LINC00987 | NA |
| This gene encodes a member of the bombesin-like family of neuropeptides, which negatively regulate eating behavior. The encoded protein may regulate colonic smooth muscle contraction through binding to its cognate receptor, the neuromedin B receptor (NMBR). Polymorphisms of this gene may be associated with hunger, weight gain and obesity. Alternative splicing results in multiple transcript variants. | ENSG00000197696 | neuromedin B | 4828 | NMB | NA |
| This gene encodes a transcription factor that is a member of the nuclear receptor subfamily 1. The encoded protein is a ligand-sensitive transcription factor that negatively regulates the expression of core clock proteins. In particular this protein represses the circadian clock transcription factor aryl hydrocarbon receptor nuclear translocator-like protein 1 (ARNTL). This protein may also be involved in regulating genes that function in metabolic, inflammatory and cardiovascular processes. | ENSG00000126368 | nuclear receptor subfamily 1 group D member 1 | 9572 | NR1D1 | NA |
| The protein encoded by this gene is a member of the type 3 G protein-coupled receptor family. Members of this superfamily are characterized by a signature 7-transmembrane domain motif. The specific function of this protein is unknown; however, this protein may mediate the cellular effects of retinoic acid on the G protein signal transduction cascade. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000170412 | G protein-coupled receptor class C group 5 member C | 55890 | GPRC5C | NA |
| APM2 gene is exclusively expressed in adipose tissue. Its function is currently unknown. | ENSG00000148671 | adipogenesis regulatory factor | 10974 | ADIRF | NA |
| Members of the perilipin family, such as PLIN5, coat intracellular lipid storage droplets and protect them from lipolytic degradation (Dalen et al., 2007 [PubMed 17234449]). | ENSG00000214456 | perilipin 5 | 440503 | PLIN5 | NA |
| NA | ENSG00000117461 | phosphoinositide-3-kinase regulatory subunit 3 | 8503 | PIK3R3 | NA |
| The protein encoded by this gene belongs to the cyclic nucleotide phosphodiesterase (PDE) family, and PDE1 subfamily. Members of the PDE1 family are calmodulin-dependent PDEs that are stimulated by a calcium-calmodulin complex. This PDE has dual-specificity for the second messengers, cAMP and cGMP, with a preference for cGMP as a substrate. cAMP and cGMP function as key regulators of many important physiological processes. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | ENSG00000123360 | phosphodiesterase 1B | 5153 | PDE1B | NA |
| The protein encoded by this gene belongs to the dpy-19 family. It is highly expressed in testis, and is required for sperm head elongation and acrosome formation during spermatogenesis. Mutations in this gene are associated with an infertility disorder, spermatogenic failure type 9 (SPGF9). | ENSG00000177990 | dpy-19 like 2 | 283417 | DPY19L2 | NA |
| The transmembrane semaphorin SEMA6A is expressed in developing neural tissue and is required for proper development of the thalamocortical projection (Leighton et al., 2001 [PubMed 11242070]). | ENSG00000092421 | semaphorin 6A | 57556 | SEMA6A | NA |
| The protein encoded by this gene contains six PDZ domains and shares sequence similarity with pro-interleukin-16 (pro-IL-16). Like pro-IL-16, the encoded protein localizes to the endoplasmic reticulum and is thought to be cleaved by a caspase to produce a secreted peptide containing two PDZ domains. In addition, this gene is upregulated in primary prostate tumors and may be involved in the early stages of prostate tumorigenesis. | ENSG00000133401 | PDZ domain containing 2 | 23037 | PDZD2 | NA |
| This gene encodes a member of the ankyrin family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton. Ankyrins play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. The protein encoded by this gene is required for targeting and stability of Na/Ca exchanger 1 in cardiomyocytes. Mutations in this gene cause long QT syndrome 4 and cardiac arrhythmia syndrome. Multiple transcript variants encoding different isoforms have been described. | ENSG00000145362 | ankyrin 2, neuronal | 287 | ANK2 | NA |
| NA | ENSG00000156750 | NA | NA | NA | TRUE |
| This gene encodes a member of the regulator of calcineurin (RCAN) protein family. These proteins play a role in many physiological processes by binding to the catalytic domain of calcineurin A, inhibiting calcineurin-mediated nuclear translocation of the transcription factor NFATC1. Expression of this gene in skin fibroblasts is upregulated by thyroid hormone, and the encoded protein may also play a role in endothelial cell function and angiogenesis. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000172348 | regulator of calcineurin 2 | 10231 | RCAN2 | NA |
| Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein localizes to the nuclear matrix, PML nuclear bodies, and cytoplasmic vesicles. A highly similar gene in the mouse is required for localization of specific membrane proteins in polarized regions of neurons. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000160460 | spectrin beta, non-erythrocytic 4 | 57731 | SPTBN4 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 16, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
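The annotation tables contain rows where `notfound` is `TRUE`, and one Ensembl ID (ENSG00000105808) that is returned twice with different symbols, so writing `out$query` directly carries both kinds of rows into the output file. If a cleaned list were wanted instead, a minimal sketch could look like the following. The helper `clean_query_ids` and the toy data frame are hypothetical and not part of the original analysis; the sketch assumes the result has the `query` and (optionally) `notfound` columns shown in the tables.

```r
# Hypothetical helper (not part of the original analysis): tidy a
# mygene::queryMany() result before writing the gene IDs to disk.
clean_query_ids <- function(out) {
  df <- as.data.frame(out)
  # `notfound` only appears when at least one query failed to map;
  # keep rows where it is NA (mapped) or FALSE.
  if ("notfound" %in% names(df)) {
    df <- df[is.na(df$notfound) | !df$notfound, , drop = FALSE]
  }
  # One Ensembl ID can return several hits; keep each ID once,
  # in order of first appearance.
  unique(as.character(df$query))
}

# Toy input mimicking the columns of the tables in this report.
toy <- data.frame(
  query    = c("ENSG00000105808", "ENSG00000105808",
               "ENSG00000256604", "ENSG00000170323"),
  symbol   = c("LOC102724229", "RASA4", NA, "FABP4"),
  notfound = c(NA, NA, TRUE, NA),
  stringsAsFactors = FALSE
)
ids <- clean_query_ids(toy)
```

The returned character vector could then be passed to `write.table()` in place of `as.factor(out$query)`. Whether unmapped IDs should be dropped depends on how the downstream files are used, so this is a design choice rather than a correction.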
out <- mygene::queryMany(gene_list[17,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | X_id | name | summary | symbol | notfound |
|---|---|---|---|---|---|
| ENSG00000137392 | 1208 | colipase | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | CLPS | NA |
| ENSG00000185615 | 64714 | protein disulfide isomerase family A member 2 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | PDIA2 | NA |
| ENSG00000172023 | 5968 | regenerating family member 1 beta | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1B | NA |
| ENSG00000168928 | 440387 | chymotrypsinogen B2 | NA | CTRB2 | NA |
| ENSG00000187021 | 5407 | pancreatic lipase related protein 1 | NA | PNLIPRP1 | NA |
| ENSG00000219073 | 23436 | chymotrypsin like elastase family member 3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | CELA3B | NA |
| ENSG00000175535 | 5406 | pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | PNLIP | NA |
| ENSG00000250606 | NA | NA | NA | NA | TRUE |
| ENSG00000142789 | 10136 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | CELA3A | NA |
| ENSG00000179751 | 342898 | syncollin | NA | SYCN | NA |
| ENSG00000168925 | 1504 | chymotrypsinogen B1 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | CTRB1 | NA |
| ENSG00000172016 | 5068 | regenerating family member 3 alpha | This gene encodes a pancreatic secretory protein that may be involved in cell proliferation or differentiation. It has similarity to the C-type lectin superfamily. The enhanced expression of this gene is observed during pancreatic inflammation and liver carcinogenesis. The mature protein also functions as an antimicrobial protein with antibacterial activity. Alternate splicing results in multiple transcript variants that encode the same protein. | REG3A | NA |
| ENSG00000076864 | 5909 | RAP1 GTPase activating protein | This gene encodes a type of GTPase-activating-protein (GAP) that down-regulates the activity of the ras-related RAP1 protein. RAP1 acts as a molecular switch by cycling between an inactive GDP-bound form and an active GTP-bound form. The product of this gene, RAP1GAP, promotes the hydrolysis of bound GTP and hence returns RAP1 to the inactive state whereas other proteins, guanine nucleotide exchange factors (GEFs), act as RAP1 activators by facilitating the conversion of RAP1 from the GDP- to the GTP-bound form. In general, ras subfamily proteins, such as RAP1, play key roles in receptor-linked signaling pathways that control cell growth and differentiation. RAP1 plays a role in diverse processes such as cell proliferation, adhesion, differentiation, and embryogenesis. Alternative splicing results in multiple transcript variants encoding distinct proteins. | RAP1GAP | NA |
| ENSG00000125414 | 4620 | myosin, heavy chain 2, skeletal muscle, adult | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | MYH2 | NA |
| ENSG00000091704 | 1357 | carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | CPA1 | NA |
| ENSG00000204983 | 5644 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | PRSS1 | NA |
| ENSG00000112210 | 51715 | RAB23, member RAS oncogene family | This gene encodes a small GTPase of the Ras superfamily. Rab proteins are involved in the regulation of diverse cellular functions associated with intracellular membrane trafficking, including autophagy and immune response to bacterial infection. The encoded protein may play a role in central nervous system development by antagonizing sonic hedgehog signaling. Disruption of this gene has been implicated in Carpenter syndrome as well as cancer. Alternative splicing results in multiple transcript variants. | RAB23 | NA |
| ENSG00000117013 | 9132 | potassium voltage-gated channel subfamily Q member 4 | The protein encoded by this gene forms a potassium channel that is thought to play a critical role in the regulation of neuronal excitability, particularly in sensory cells of the cochlea. The current generated by this channel is inhibited by M1 muscarinic acetylcholine receptors and activated by retigabine, a novel anti-convulsant drug. The encoded protein can form a homomultimeric potassium channel or possibly a heteromultimeric channel in association with the protein encoded by the KCNQ3 gene. Defects in this gene are a cause of nonsyndromic sensorineural deafness type 2 (DFNA2), an autosomal dominant form of progressive hearing loss. Two transcript variants encoding different isoforms have been found for this gene. | KCNQ4 | NA |
| ENSG00000134871 | 1284 | collagen type IV alpha 2 | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. The C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. | COL4A2 | NA |
| ENSG00000142615 | 63036 | chymotrypsin like elastase family member 2A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2A is secreted from the pancreas as a zymogen. In other species, elastase 2A has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | CELA2A | NA |
| ENSG00000172403 | 171024 | synaptopodin 2 | NA | SYNPO2 | NA |
| ENSG00000238133 | 339751 | MLK7 antisense RNA 1 | NA | MLK7-AS1 | NA |
| ENSG00000169347 | 2813 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | GP2 | NA |
| ENSG00000164949 | 2669 | GTP binding protein overexpressed in skeletal muscle | The protein encoded by this gene belongs to the RAD/GEM family of GTP-binding proteins. It is associated with the inner face of the plasma membrane and could play a role as a regulatory protein in receptor-mediated signal transduction. Alternative splicing occurs at this locus and two transcript variants encoding the same protein have been identified. | GEM | NA |
| ENSG00000167600 | 29785 | cytochrome P450 family 2 subfamily S member 1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. In rodents, the homologous protein has been shown to metabolize certain carcinogens; however, the specific function of the human protein has not been determined. | CYP2S1 | NA |
| ENSG00000115590 | 7850 | interleukin 1 receptor type 2 | The protein encoded by this gene is a cytokine receptor that belongs to the interleukin 1 receptor family. This protein binds interleukin alpha (IL1A), interleukin beta (IL1B), and interleukin 1 receptor, type I (IL1R1/IL1RA), and acts as a decoy receptor that inhibits the activity of its ligands. Interleukin 4 (IL4) is reported to antagonize the activity of interleukin 1 by inducing the expression and release of this cytokine. This gene and three other genes form a cytokine receptor gene cluster on chromosome 2q12. Alternative splicing results in multiple transcript variants and protein isoforms. Alternative splicing produces both membrane-bound and soluble proteins. A soluble protein is also produced by proteolytic cleavage. | IL1R2 | NA |
| ENSG00000129521 | 112399 | egl-9 family hypoxia inducible factor 3 | NA | EGLN3 | NA |
| ENSG00000124145 | 6385 | syndecan 4 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan that functions as a receptor in intracellular signaling. The encoded protein is found as a homodimer and is a member of the syndecan proteoglycan family. This gene is found on chromosome 20, while a pseudogene has been found on chromosome 22. | SDC4 | NA |
| ENSG00000174171 | 105370792 | uncharacterized LOC105370792 | NA | LOC105370792 | NA |
| ENSG00000170890 | 5319 | phospholipase A2 group IB | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | PLA2G1B | NA |
| ENSG00000067191 | 782 | calcium voltage-gated channel auxiliary subunit beta 1 | The protein encoded by this gene belongs to the calcium channel beta subunit family. It plays an important role in the calcium channel by modulating G protein inhibition, increasing peak calcium current, controlling the alpha-1 subunit membrane targeting and shifting the voltage dependence of activation and inactivation. Alternative splicing occurs at this locus and three transcript variants encoding three distinct isoforms have been identified. | CACNB1 | NA |
| ENSG00000161281 | 1346 | cytochrome c oxidase subunit 7A1 | Cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. This component is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may function in the regulation and assembly of the complex. This nuclear gene encodes polypeptide 1 (muscle isoform) of subunit VIIa and the polypeptide 1 is present only in muscle tissues. Other polypeptides of subunit VIIa are present in both muscle and nonmuscle tissues, and are encoded by different genes. | COX7A1 | NA |
| ENSG00000174136 | 285704 | repulsive guidance molecule family member b | RGMB is a glycosylphosphatidylinositol (GPI)-anchored member of the repulsive guidance molecule family (see RGMA, MIM 607362) and contributes to the patterning of the developing nervous system (Samad et al., 2005 [PubMed 15671031]). | RGMB | NA |
| ENSG00000169442 | 1043 | CD52 molecule | NA | CD52 | NA |
| ENSG00000110880 | 23603 | coronin 1C | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. Three transcript variants encoding two different isoforms have been found for this gene. | CORO1C | NA |
| ENSG00000214290 | 120376 | colorectal cancer associated 2 | NA | COLCA2 | NA |
| ENSG00000137094 | 25822 | DnaJ heat shock protein family (Hsp40) member B5 | DNAJB5 belongs to the evolutionarily conserved DNAJ/HSP40 protein family. For background information on the DNAJ family, see MIM 608375. | DNAJB5 | NA |
| ENSG00000140511 | 145864 | hyaluronan and proteoglycan link protein 3 | This gene belongs to the hyaluronan and proteoglycan binding link protein gene family. The protein encoded by this gene may function in hyaluronic acid binding and cell adhesion. | HAPLN3 | NA |
| ENSG00000158270 | 81035 | collectin subfamily member 12 | This gene encodes a member of the C-lectin family, proteins that possess collagen-like sequences and carbohydrate recognition domains. This protein is a scavenger receptor, a cell surface glycoprotein that displays several functions associated with host defense. It can bind to carbohydrate antigens on microorganisms, facilitating their recognition and removal. It also mediates the recognition, internalization, and degradation of oxidatively modified low density lipoprotein by vascular endothelial cells. | COLEC12 | NA |
| ENSG00000141086 | 1506 | chymotrypsin like | NA | CTRL | NA |
| ENSG00000142156 | 1291 | collagen type VI alpha 1 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | COL6A1 | NA |
| ENSG00000185532 | 5592 | protein kinase, cGMP-dependent, type I | Mammals have three different isoforms of cyclic GMP-dependent protein kinase (Ialpha, Ibeta, and II). These PRKG isoforms act as key mediators of the nitric oxide/cGMP signaling pathway and are important components of many signal transduction processes in diverse cell types. This PRKG1 gene on human chromosome 10 encodes the soluble Ialpha and Ibeta isoforms of PRKG by alternative transcript splicing. A separate gene on human chromosome 4, PRKG2, encodes the membrane-bound PRKG isoform II. The PRKG1 proteins play a central role in regulating cardiovascular and neuronal functions in addition to relaxing smooth muscle tone, preventing platelet aggregation, and modulating cell growth. This gene is most strongly expressed in all types of smooth muscle, platelets, cerebellar Purkinje cells, hippocampal neurons, and the lateral amygdala. Isoforms Ialpha and Ibeta have identical cGMP-binding and catalytic domains but differ in their leucine/isoleucine zipper and autoinhibitory sequences and therefore differ in their dimerization substrates and kinase enzyme activity. | PRKG1 | NA |
| ENSG00000070190 | 27071 | dual adaptor of phosphotyrosine and 3-phosphoinositides 1 | NA | DAPP1 | NA |
| ENSG00000023902 | 51177 | pleckstrin homology domain containing O1 | NA | PLEKHO1 | NA |
| ENSG00000244945 | 101928445 | uncharacterized LOC101928445 | NA | LOC101928445 | NA |
| ENSG00000224597 | 102724316 | SVIL antisense RNA 1 | NA | SVIL-AS1 | NA |
| ENSG00000245864 | ENSG00000245864 | NA | NA | CTC-467M3.1 | NA |
| ENSG00000135346 | 1081 | glycoprotein hormones, alpha polypeptide | The four human glycoprotein hormones chorionic gonadotropin (CG), luteinizing hormone (LH), follicle stimulating hormone (FSH), and thyroid stimulating hormone (TSH) are dimers consisting of alpha and beta subunits that are associated noncovalently. The alpha subunits of these hormones are identical, however, their beta chains are unique and confer biological specificity. The protein encoded by this gene is the alpha subunit and belongs to the glycoprotein hormones alpha chain family. Two transcript variants encoding different isoforms have been found for this gene. | CGA | NA |
| ENSG00000118496 | 84085 | F-box protein 30 | This gene encodes a member of the F-box protein family which is characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of the ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into 3 classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbxs class and it is upregulated in nasopharyngeal carcinoma. | FBXO30 | NA |
| ENSG00000158516 | 1358 | carboxypeptidase A2 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | CPA2 | NA |
| ENSG00000213639 | 5500 | protein phosphatase 1 catalytic subunit beta | The protein encoded by this gene is one of the three catalytic subunits of protein phosphatase 1 (PP1). PP1 is a serine/threonine specific protein phosphatase known to be involved in the regulation of a variety of cellular processes, such as cell division, glycogen metabolism, muscle contractility, protein synthesis, and HIV-1 viral transcription. Mouse studies suggest that PP1 functions as a suppressor of learning and memory. Two alternatively spliced transcript variants encoding distinct isoforms have been observed. | PPP1CB | NA |
| ENSG00000148516 | 6935 | zinc finger E-box binding homeobox 1 | This gene encodes a zinc finger transcription factor. The encoded protein likely plays a role in transcriptional repression of interleukin 2. Mutations in this gene have been associated with posterior polymorphous corneal dystrophy-3 and late-onset Fuchs endothelial corneal dystrophy. Alternatively spliced transcript variants encoding different isoforms have been described. | ZEB1 | NA |
| ENSG00000153002 | 1360 | carboxypeptidase B1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | CPB1 | NA |
| ENSG00000112208 | 9532 | BCL2 associated athanogene 2 | BAG proteins compete with Hip for binding to the Hsc70/Hsp70 ATPase domain and promote substrate release. All the BAG proteins have an approximately 45-amino acid BAG domain near the C terminus but differ markedly in their N-terminal regions. The predicted BAG2 protein contains 211 amino acids. The BAG domains of BAG1, BAG2, and BAG3 interact specifically with the Hsc70 ATPase domain in vitro and in mammalian cells. All 3 proteins bind with high affinity to the ATPase domain of Hsc70 and inhibit its chaperone activity in a Hip-repressible manner. | BAG2 | NA |
| ENSG00000100342 | 8542 | apolipoprotein L1 | This gene encodes a secreted high density lipoprotein which binds to apolipoprotein A-I. Apolipoprotein A-I is a relatively abundant plasma protein and is the major apoprotein of HDL. It is involved in the formation of most cholesteryl esters in plasma and also promotes efflux of cholesterol from cells. This apolipoprotein L family member may play a role in lipid exchange and transport throughout the body, as well as in reverse cholesterol transport from peripheral cells to the liver. Several different transcript variants encoding different isoforms have been found for this gene. | APOL1 | NA |
| ENSG00000086015 | 23139 | microtubule associated serine/threonine kinase 2 | NA | MAST2 | NA |
| ENSG00000175899 | 2 | alpha-2-macroglobulin | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | A2M | NA |
| ENSG00000152268 | NA | NA | NA | NA | TRUE |
| ENSG00000213144 | ENSG00000213144 | NA | NA | RP11-64B16.2 | NA |
| ENSG00000124831 | 9208 | leucine rich repeat (in FLII) interacting protein 1 | NA | LRRFIP1 | NA |
| ENSG00000121057 | 8165 | A-kinase anchoring protein 1 | The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. The encoded protein binds to type I and type II regulatory subunits of PKA and anchors them to the mitochondrion. This protein is speculated to be involved in the cAMP-dependent signal transduction pathway and in directing RNA to a specific cellular compartment. | AKAP1 | NA |
| ENSG00000173641 | 27129 | heat shock protein family B (small) member 7 | NA | HSPB7 | NA |
| ENSG00000164078 | 4486 | macrophage stimulating 1 receptor | This gene encodes a cell surface receptor for macrophage-stimulating protein (MSP) with tyrosine kinase activity. The mature form of this protein is a heterodimer of disulfide-linked alpha and beta subunits, generated by proteolytic cleavage of a single-chain precursor. The beta subunit undergoes tyrosine phosphorylation upon stimulation by MSP. This protein is expressed on the ciliated epithelia of the mucociliary transport apparatus of the lung, and together with MSP, thought to be involved in host defense. Alternative splicing generates multiple transcript variants encoding different isoforms that may undergo similar proteolytic processing. | MST1R | NA |
| ENSG00000187498 | 1282 | collagen type IV alpha 1 chain | This gene encodes a type IV collagen alpha protein. Type IV collagen proteins are integral components of basement membranes. This gene shares a bidirectional promoter with a paralogous gene on the opposite strand. The protein consists of an amino-terminal 7S domain, a triple-helix forming collagenous domain, and a carboxy-terminal non-collagenous domain. It functions as part of a heterotrimer and interacts with other extracellular matrix components such as perlecans, proteoglycans, and laminins. In addition, proteolytic cleavage of the non-collagenous carboxy-terminal domain results in a biologically active fragment known as arresten, which has anti-angiogenic and tumor suppressor properties. Mutations in this gene cause porencephaly, cerebrovascular disease, and renal and muscular defects. Alternative splicing results in multiple transcript variants. | COL4A1 | NA |
| ENSG00000107438 | 9124 | PDZ and LIM domain 1 | This gene encodes a member of the enigma protein family. The protein contains two protein interacting domains, a PDZ domain at the amino terminal end and one to three LIM domains at the carboxyl terminal. It is a cytoplasmic protein associated with the cytoskeleton. The protein may function as an adapter to bring other LIM-interacting proteins to the cytoskeleton. Pseudogenes associated with this gene are located on chromosomes 3, 14 and 17. | PDLIM1 | NA |
| ENSG00000148498 | 56288 | par-3 family cell polarity regulator | This gene encodes a member of the PARD protein family. PARD family members interact with other PARD family members and other proteins; they affect asymmetrical cell division and direct polarized cell growth. Multiple alternatively spliced transcript variants have been described for this gene. | PARD3 | NA |
| ENSG00000119938 | 5507 | protein phosphatase 1 regulatory subunit 3C | This gene encodes a regulatory subunit of protein phosphatase-1 (PP1). PP1 catalyzes reversible protein phosphorylation, which is important in a wide range of cellular activities: neuronal, muscular, RNA splicing, protein synthesis, cell death, and glycogen metabolism, to name just a few. By interacting with different regulatory subunits, PP1 is directed to different parts of the cell, to different substrates, or to respond to extracellular signals. | PPP1R3C | NA |
| ENSG00000157404 | 3815 | KIT proto-oncogene receptor tyrosine kinase | This gene encodes the human homolog of the proto-oncogene c-kit. C-kit was first identified as the cellular homolog of the feline sarcoma viral oncogene v-kit. This protein is a type 3 transmembrane receptor for MGF (mast cell growth factor, also known as stem cell factor). Mutations in this gene are associated with gastrointestinal stromal tumors, mast cell disease, acute myelogenous leukemia, and piebaldism. Multiple transcript variants encoding different isoforms have been found for this gene. | KIT | NA |
| ENSG00000119686 | 55640 | feline leukemia virus subgroup C cellular receptor family member 2 | This gene encodes a member of the major facilitator superfamily. The encoded transmembrane protein is a calcium transporter. Unlike the related protein feline leukemia virus subgroup C receptor 1, the protein encoded by this locus does not bind to feline leukemia virus subgroup C envelope protein. The encoded protein may play a role in development of brain vascular endothelial cells, as mutations at this locus have been associated with proliferative vasculopathy and hydranencephaly-hydrocephaly syndrome. Alternatively spliced transcript variants have been described. | FLVCR2 | NA |
| ENSG00000154096 | 7070 | Thy-1 cell surface antigen | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | THY1 | NA |
| ENSG00000268364 | ENSG00000268364 | SMC5 antisense RNA 1 (head to head) | NA | SMC5-AS1 | NA |
| ENSG00000160200 | 875 | cystathionine-beta-synthase | The protein encoded by this gene acts as a homotetramer to catalyze the conversion of homocysteine to cystathionine, the first step in the transsulfuration pathway. The encoded protein is allosterically activated by adenosyl-methionine and uses pyridoxal phosphate as a cofactor. Defects in this gene can cause cystathionine beta-synthase deficiency (CBSD), which can lead to homocystinuria. This gene is a major contributor to cellular hydrogen sulfide production. Multiple alternatively spliced transcript variants have been found for this gene. | CBS | NA |
| ENSG00000110799 | 7450 | von Willebrand factor | This gene encodes a glycoprotein involved in hemostasis. The encoded preproprotein is proteolytically processed following assembly into large multimeric complexes. These complexes function in the adhesion of platelets to sites of vascular injury and the transport of various proteins in the blood. Mutations in this gene result in von Willebrand disease, an inherited bleeding disorder. An unprocessed pseudogene has been found on chromosome 22. | VWF | NA |
| ENSG00000109061 | 4619 | myosin, heavy chain 1, skeletal muscle, adult | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | MYH1 | NA |
| ENSG00000174306 | 23051 | zinc fingers and homeoboxes 3 | This gene encodes a member of the zinc fingers and homeoboxes (ZHX) gene family. The encoded protein contains two C2H2-type zinc fingers and five homeodomains and forms a dimer with itself or with zinc fingers and homeoboxes family member 1. In the nucleus, the dimerized protein interacts with the A subunit of the ubiquitous transcription factor nuclear factor-Y and may function as a transcriptional repressor. | ZHX3 | NA |
| ENSG00000127824 | 7277 | tubulin alpha 4a | Microtubules of the eukaryotic cytoskeleton perform essential and diverse functions and are composed of a heterodimer of alpha and beta tubulin. The genes encoding these microtubule constituents are part of the tubulin superfamily, which is composed of six distinct families. Genes from the alpha, beta and gamma tubulin families are found in all eukaryotes. The alpha and beta tubulins represent the major components of microtubules, while gamma tubulin plays a critical role in the nucleation of microtubule assembly. There are multiple alpha and beta tubulin genes and they are highly conserved among and between species. This gene encodes an alpha tubulin that is a highly conserved homolog of a rat testis-specific alpha tubulin. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | TUBA4A | NA |
| ENSG00000140416 | 7168 | tropomyosin 1 (alpha) | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | TPM1 | NA |
| ENSG00000167617 | 148170 | CDC42 effector protein 5 | Cell division control protein 42 (CDC42), a small Rho GTPase, regulates the formation of F-actin-containing structures through its interaction with the downstream effector proteins. The protein encoded by this gene is a member of the Borg (binder of Rho GTPases) family of CDC42 effector proteins. Borg family proteins contain a CRIB (Cdc42/Rac interactive-binding) domain. They bind to CDC42 and regulate its function negatively. The encoded protein may inhibit c-Jun N-terminal kinase (JNK) independently of CDC42 binding. The protein may also play a role in septin organization and inducing pseudopodia formation in fibroblasts. | CDC42EP5 | NA |
| ENSG00000111371 | 81539 | solute carrier family 38 member 1 | Amino acid transporters play essential roles in the uptake of nutrients, production of energy, chemical metabolism, detoxification, and neurotransmitter cycling. SLC38A1 is an important transporter of glutamine, an intermediate in the detoxification of ammonia and the production of urea. Glutamine serves as a precursor for the synaptic transmitter, glutamate (Gu et al., 2001 [PubMed 11325958]). | SLC38A1 | NA |
| ENSG00000234175 | ENSG00000234175 | NA | NA | RP11-730A19.9 | NA |
| ENSG00000184113 | 7122 | claudin 5 | This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets. Mutations in this gene have been found in patients with velocardiofacial syndrome. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | CLDN5 | NA |
| ENSG00000267328 | ENSG00000267328 | NA | NA | AC002398.12 | NA |
| ENSG00000168484 | 6440 | surfactant protein C | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | SFTPC | NA |
| ENSG00000107130 | 23413 | neuronal calcium sensor 1 | This gene is a member of the neuronal calcium sensor gene family, which encode calcium-binding proteins expressed predominantly in neurons. The protein encoded by this gene regulates G protein-coupled receptor phosphorylation in a calcium-dependent manner and can substitute for calmodulin. The protein is associated with secretory granules and modulates synaptic transmission and synaptic plasticity. Multiple transcript variants encoding different isoforms have been found for this gene. | NCS1 | NA |
| ENSG00000257410 | ENSG00000257410 | NA | NA | RP11-2H8.2 | NA |
| ENSG00000244274 | 55861 | dysbindin domain containing 2 | NA | DBNDD2 | NA |
| ENSG00000198668 | 801 | calmodulin 1 (phosphorylase kinase, delta) | This gene encodes a member of the EF-hand calcium-binding protein family. It is one of three genes which encode an identical calcium binding protein which is one of the four subunits of phosphorylase kinase. Two pseudogenes have been identified on chromosome 7 and X. Multiple transcript variants encoding different isoforms have been found for this gene. | CALM1 | NA |
| ENSG00000198668 | 805 | calmodulin 2 (phosphorylase kinase, delta) | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | CALM2 | NA |
| ENSG00000100784 | 9252 | ribosomal protein S6 kinase A5 | NA | RPS6KA5 | NA |
| ENSG00000266101 | ENSG00000266101 | NA | NA | RP5-906A24.2 | NA |
| ENSG00000213165 | NA | NA | NA | NA | TRUE |
| ENSG00000163702 | 84818 | interleukin 17 receptor C | This gene encodes a single-pass type I membrane protein that shares similarity with the interleukin-17 receptor (IL-17RA). Unlike IL-17RA, which is predominantly expressed in hemopoietic cells, and binds with high affinity to only IL-17A, this protein is expressed in nonhemopoietic tissues, and binds both IL-17A and IL-17F with similar affinities. The proinflammatory cytokines, IL-17A and IL-17F, have been implicated in the progression of inflammatory and autoimmune diseases. Multiple alternatively spliced transcript variants encoding different isoforms have been detected for this gene, and it has been proposed that soluble, secreted proteins lacking transmembrane and intracellular domains may function as extracellular antagonists to cytokine signaling. | IL17RC | NA |
| ENSG00000097007 | 25 | ABL proto-oncogene 1, non-receptor tyrosine kinase | This gene is a protooncogene that encodes a protein tyrosine kinase involved in a variety of cellular processes, including cell division, adhesion, differentiation, and response to stress. The activity of the protein is negatively regulated by its SH3 domain, whereby deletion of the region encoding this domain results in an oncogene. The ubiquitously expressed protein has DNA-binding activity that is regulated by CDC2-mediated phosphorylation, suggesting a cell cycle function. This gene has been found fused to a variety of translocation partner genes in various leukemias, most notably the t(9;22) translocation that results in a fusion with the 5’ end of the breakpoint cluster region gene (BCR; MIM:151410). Alternative splicing of this gene results in two transcript variants, which contain alternative first exons that are spliced to the remaining common exons. | ABL1 | NA |
| ENSG00000183092 | 57596 | brain enriched guanylate kinase associated | NA | BEGAIN | NA |
| ENSG00000111907 | 7164 | tumor protein D52-like 1 | This gene encodes a member of a family of proteins that contain coiled-coil domains and may form hetero- or homomers. The encoded protein is involved in cell proliferation and calcium signaling. It also interacts with the mitogen-activated protein kinase kinase kinase 5 (MAP3K5/ASK1) and positively regulates MAP3K5-induced apoptosis. Multiple alternatively spliced transcript variants have been observed. | TPD52L1 | NA |
| ENSG00000118785 | 6696 | secreted phosphoprotein 1 | The protein encoded by this gene is involved in the attachment of osteoclasts to the mineralized bone matrix. The encoded protein is secreted and binds hydroxyapatite with high affinity. The osteoclast vitronectin receptor is found in the cell membrane and may be involved in the binding to this protein. This protein is also a cytokine that upregulates expression of interferon-gamma and interleukin-12. Several transcript variants encoding different isoforms have been found for this gene. | SPP1 | NA |
| ENSG00000183044 | 18 | 4-aminobutyrate aminotransferase | 4-aminobutyrate aminotransferase (ABAT) is responsible for catabolism of gamma-aminobutyric acid (GABA), an important, mostly inhibitory neurotransmitter in the central nervous system, into succinic semialdehyde. The active enzyme is a homodimer of 50-kD subunits complexed to pyridoxal-5-phosphate. The protein sequence is over 95% similar to the pig protein. GABA is estimated to be present in nearly one-third of human synapses. ABAT in liver and brain is controlled by 2 codominant alleles with a frequency in a Caucasian population of 0.56 and 0.44. The ABAT deficiency phenotype includes psychomotor retardation, hypotonia, hyperreflexia, lethargy, refractory seizures, and EEG abnormalities. Multiple alternatively spliced transcript variants encoding the same protein isoform have been found for this gene. | ABAT | NA |
| ENSG00000170667 | 100271927 | RAS p21 protein activator 4B | NA | RASA4B | NA |
| ENSG00000255112 | 57132 | charged multivesicular body protein 1B | CHMP1B belongs to the chromatin-modifying protein/charged multivesicular body protein (CHMP) family. These proteins are components of ESCRT-III (endosomal sorting complex required for transport III), a complex involved in degradation of surface receptor proteins and formation of endocytic multivesicular bodies (MVBs). Some CHMPs have both nuclear and cytoplasmic/vesicular distributions, and one such CHMP, CHMP1A (MIM 164010), is required for both MVB formation and regulation of cell cycle progression (Tsang et al., 2006 [PubMed 16730941]). | CHMP1B | NA |
| ENSG00000004776 | 126393 | heat shock protein family B (small) member 6 | This locus encodes a heat shock protein. The encoded protein likely plays a role in smooth muscle relaxation. | HSPB6 | NA |
| ENSG00000131242 | 84440 | RAB11 family interacting protein 4 | Proteins of the large Rab GTPase family (see RAB1A; MIM 179508) have regulatory roles in the formation, targeting, and fusion of intracellular transport vesicles. RAB11FIP4 is one of many proteins that interact with and regulate Rab GTPases (Hales et al., 2001 [PubMed 11495908]). | RAB11FIP4 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",17,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
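
The query, table, and write steps are repeated for every cluster with only the row index changing. As a minimal sketch, assuming the same `queryMany` arguments used throughout this analysis, the pattern could be wrapped in one helper. The names `annotate_cluster`, `out_dir`, and `query_fun` are hypothetical and are not part of the original analysis or of the mygene package.

```r
# Hypothetical helper (not from the original analysis): annotate the genes of
# cluster k and save the queried Ensembl IDs, one per line.
# `query_fun` defaults to mygene::queryMany; because R evaluates default
# arguments lazily, mygene is only needed when no replacement is supplied.
annotate_cluster <- function(gene_list, k, out_dir,
                             query_fun = mygene::queryMany) {
  # Same scopes, fields and species as in the calls in this document.
  out <- query_fun(gene_list[k, ], scopes = "ensembl.gene",
                   fields = c("name", "summary", "symbol"),
                   species = "human")
  # File name follows the gene_names_clus_<k>.txt convention used above.
  out_file <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  write.table(out$query, out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  # Return the annotations as a data frame so the caller can pass them to kable().
  invisible(as.data.frame(out))
}
```

Under these assumptions, `kable(annotate_cluster(gene_list, 18, "../utilities/GTEX2013_sparse_load_voom"))` would stand in for one query, table, and write sequence, and the cluster index in the file name can no longer drift from the row that was queried.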
out <- mygene::queryMany(gene_list[18,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| symbol | X_id | query | name | summary |
|---|---|---|---|---|
| PRSS27 | 83886 | ENSG00000172382 | protease, serine 27 | This gene is located within a large protease gene cluster on chromosome 16. It belongs to the group-1 subfamily of serine proteases. The encoded protein is a secreted tryptic serine protease and is expressed mainly in the pancreas. Alternative splicing results in multiple transcript variants. |
| MAL | 4118 | ENSG00000172005 | mal, T-cell differentiation protein | The protein encoded by this gene is a highly hydrophobic integral membrane protein belonging to the MAL family of proteolipids. The protein has been localized to the endoplasmic reticulum of T-cells and is a candidate linker protein in T-cell signal transduction. In addition, this proteolipid is localized in compact myelin of cells in the nervous system and has been implicated in myelin biogenesis and/or function. The protein plays a role in the formation, stabilization and maintenance of glycosphingolipid-enriched membrane microdomains. Down-regulation of this gene has been associated with a variety of human epithelial malignancies. Alternative splicing produces four transcript variants which vary from each other by the presence or absence of alternatively spliced exons 2 and 3. |
| TRIM29 | 23650 | ENSG00000137699 | tripartite motif containing 29 | The protein encoded by this gene belongs to the TRIM protein family. It has multiple zinc finger motifs and a leucine zipper motif. It has been proposed to form homo- or heterodimers which are involved in nucleic acid binding. Thus, it may act as a transcriptional regulatory factor involved in carcinogenesis and/or differentiation. It may also function in the suppression of radiosensitivity since it is associated with ataxia telangiectasia phenotype. |
| TGM1 | 7051 | ENSG00000092295 | transglutaminase 1 | The protein encoded by this gene is a membrane protein that catalyzes the addition of an alkyl group from an alkylamine to a glutamine residue of a protein, forming an alkylglutamine in the protein. This protein alkylation leads to crosslinking of proteins and catenation of polyamines to proteins. This gene contains either one or two copies of a 22 nt repeat unit in its 3’ UTR. Mutations in this gene have been associated with autosomal recessive lamellar ichthyosis (LI) and nonbullous congenital ichthyosiform erythroderma (NCIE). |
| S100A2 | 6273 | ENSG00000196754 | S100 calcium binding protein A2 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may have a tumor suppressor function. Chromosomal rearrangements and altered expression of this gene have been implicated in breast cancer. |
| GJB2 | 2706 | ENSG00000165474 | gap junction protein beta 2 | This gene encodes a member of the gap junction protein family. The gap junctions were first characterized by electron microscopy as regionally specialized structures on plasma membranes of contacting adherent cells. These structures were shown to consist of cell-to-cell channels that facilitate the transfer of ions and small molecules between cells. The gap junction proteins, also known as connexins, purified from fractions of enriched gap junctions from different tissues differ. According to sequence similarities at the nucleotide and amino acid levels, the gap junction proteins are divided into two categories, alpha and beta. Mutations in this gene are responsible for as much as 50% of pre-lingual, recessive deafness. |
| PTK6 | 5753 | ENSG00000101213 | protein tyrosine kinase 6 | The protein encoded by this gene is a cytoplasmic nonreceptor protein kinase which may function as an intracellular signal transducer in epithelial tissues. Overexpression of this gene in mammary epithelial cells leads to sensitization of the cells to epidermal growth factor and results in a partially transformed phenotype. Expression of this gene has been detected at low levels in some breast tumors but not in normal breast tissue. The encoded protein has been shown to undergo autophosphorylation. Alternative splicing results in multiple transcript variants. |
| SPINK5 | 11005 | ENSG00000133710 | serine peptidase inhibitor, Kazal type 5 | This gene encodes a multidomain serine protease inhibitor that contains 15 potential inhibitory domains. The encoded preproprotein is proteolytically processed to generate multiple protein products, which may exhibit unique activities and specificities. These proteins may play a role in skin and hair morphogenesis, as well as anti-inflammatory and antimicrobial protection of mucous epithelia. Mutations in this gene may result in Netherton syndrome, a disorder characterized by ichthyosis, defective cornification, and atopy. This gene is present in a gene cluster on chromosome 5. Alternative splicing results in multiple transcript variants. |
| VSIG10L | 147645 | ENSG00000186806 | V-set and immunoglobulin domain containing 10 like | NA |
| LYPD3 | 27076 | ENSG00000124466 | LY6/PLAUR domain containing 3 | NA |
| TTC9 | 23508 | ENSG00000133985 | tetratricopeptide repeat domain 9 | This gene encodes a protein that contains three tetratricopeptide repeats. The gene has been shown to be hormonally regulated in breast cancer cells and may play a role in cancer cell invasion and metastasis. |
| CRABP2 | 1382 | ENSG00000143320 | cellular retinoic acid binding protein 2 | This gene encodes a member of the retinoic acid (RA, a form of vitamin A) binding protein family and lipocalin/cytosolic fatty-acid binding protein family. The protein is a cytosol-to-nuclear shuttling protein, which facilitates RA binding to its cognate receptor complex and transfer to the nucleus. It is involved in the retinoid signaling pathway, and is associated with increased circulating low-density lipoprotein cholesterol. Alternatively spliced transcript variants encoding the same protein have been found for this gene. |
| CTC-251D13.1 | ENSG00000271795 | ENSG00000271795 | NA | NA |
| CYSRT1 | 375791 | ENSG00000197191 | cysteine rich tail 1 | NA |
| DEGS2 | 123099 | ENSG00000168350 | delta(4)-desaturase, sphingolipid 2 | This gene encodes a bifunctional enzyme that is involved in the biosynthesis of phytosphingolipids in human skin and in other phytosphingolipid-containing tissues. This enzyme can act as a sphingolipid delta(4)-desaturase, and also as a sphingolipid C4-hydroxylase. |
| CSTB | 1476 | ENSG00000160213 | cystatin B | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins L, H and B. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). |
| KRT19 | 3880 | ENSG00000171345 | keratin 19 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. |
| PSCA | 8000 | ENSG00000167653 | prostate stem cell antigen | This gene encodes a glycosylphosphatidylinositol-anchored cell membrane glycoprotein. In addition to being highly expressed in the prostate it is also expressed in the bladder, placenta, colon, kidney, and stomach. This gene is up-regulated in a large proportion of prostate cancers and is also detected in cancers of the bladder and pancreas. This gene includes a polymorphism that results in an upstream start codon in some individuals; this polymorphism is thought to be associated with a risk for certain gastric and bladder cancers. Alternative splicing results in multiple transcript variants. |
| TIAM1 | 7074 | ENSG00000156299 | T-cell lymphoma invasion and metastasis 1 | NA |
| AIF1L | 83543 | ENSG00000126878 | allograft inflammatory factor 1 like | NA |
| ECM1 | 1893 | ENSG00000143369 | extracellular matrix protein 1 | This gene encodes a soluble protein that is involved in endochondral bone formation, angiogenesis, and tumor biology. It also interacts with a variety of extracellular and structural proteins, contributing to the maintenance of skin integrity and homeostasis. Mutations in this gene are associated with lipoid proteinosis disorder (also known as hyalinosis cutis et mucosae or Urbach-Wiethe disease) that is characterized by generalized thickening of skin, mucosae and certain viscera. Alternatively spliced transcript variants encoding distinct isoforms have been described for this gene. |
| S100A8 | 6279 | ENSG00000143546 | S100 calcium binding protein A8 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. |
| ALDH3A1 | 218 | ENSG00000108602 | aldehyde dehydrogenase 3 family member A1 | Aldehyde dehydrogenases oxidize various aldehydes to the corresponding acids. They are involved in the detoxification of alcohol-derived acetaldehyde and in the metabolism of corticosteroids, biogenic amines, neurotransmitters, and lipid peroxidation. The enzyme encoded by this gene forms a cytoplasmic homodimer that preferentially oxidizes aromatic and medium-chain (6 carbons or more) saturated and unsaturated aldehyde substrates. It is thought to promote resistance to UV and 4-hydroxy-2-nonenal-induced oxidative damage in the cornea. The gene is located within the Smith-Magenis syndrome region on chromosome 17. Multiple alternatively spliced variants, encoding the same protein, have been identified. |
| S100A14 | 57402 | ENSG00000189334 | S100 calcium binding protein A14 | This gene encodes a member of the S100 protein family which contains an EF-hand motif and binds calcium. The gene is located in a cluster of S100 genes on chromosome 1. Levels of the encoded protein have been found to be lower in cancerous tissue and associated with metastasis suggesting a tumor suppressor function (PMID: 19956863, 19351828). |
| CNFN | 84518 | ENSG00000105427 | cornifelin | NA |
| MREG | 55686 | ENSG00000118242 | melanoregulin | NA |
| RP11-20D14.6 | ENSG00000249790 | ENSG00000249790 | NA | NA |
| GEM | 2669 | ENSG00000164949 | GTP binding protein overexpressed in skeletal muscle | The protein encoded by this gene belongs to the RAD/GEM family of GTP-binding proteins. It is associated with the inner face of the plasma membrane and could play a role as a regulatory protein in receptor-mediated signal transduction. Alternative splicing occurs at this locus and two transcript variants encoding the same protein have been identified. |
| SAA1 | 6288 | ENSG00000173432 | serum amyloid A1 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. |
| DUOX1 | 53905 | ENSG00000137857 | dual oxidase 1 | The protein encoded by this gene is a glycoprotein and a member of the NADPH oxidase family. The synthesis of thyroid hormone is catalyzed by a protein complex located at the apical membrane of thyroid follicular cells. This complex contains an iodide transporter, thyroperoxidase, and a peroxide generating system that includes proteins encoded by this gene and the similar DUOX2 gene. This protein is known as dual oxidase because it has both a peroxidase homology domain and a gp91phox domain. This protein generates hydrogen peroxide and thereby plays a role in the activity of thyroid peroxidase, lactoperoxidase, and in lactoperoxidase-mediated antimicrobial defense at mucosal surfaces. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. |
| PPL | 5493 | ENSG00000118898 | periplakin | The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. |
| GRHL1 | 29841 | ENSG00000134317 | grainyhead like transcription factor 1 | This gene encodes a member of the grainyhead family of transcription factors. The encoded protein can exist as a homodimer or can form heterodimers with sister-of-mammalian grainyhead or brother-of-mammalian grainyhead. This protein functions as a transcription factor during development. |
| RP11-67L3.5 | ENSG00000242396 | ENSG00000242396 | NA | NA |
| IL20RB | 53833 | ENSG00000174564 | interleukin 20 receptor subunit beta | IL20RB and IL20RA (MIM 605620) form a heterodimeric receptor for interleukin-20 (IL20; MIM 605619) (Blumberg et al., 2001 [PubMed 11163236]). |
| SULT2B1 | 6820 | ENSG00000088002 | sulfotransferase family 2B member 1 | Sulfotransferase enzymes catalyze the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds. These cytosolic enzymes are different in their tissue distributions and substrate specificities. The gene structure (number and length of exons) is similar among family members. This gene sulfates dehydroepiandrosterone but not 4-nitrophenol, a typical substrate for the phenol and estrogen sulfotransferase subfamilies. Two alternatively spliced variants that encode different isoforms have been described. |
| CSTA | 1475 | ENSG00000121552 | cystatin A | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins, and kininogens. This gene encodes a stefin that functions as a cysteine protease inhibitor, forming tight complexes with papain and the cathepsins B, H, and L. The protein is one of the precursor proteins of cornified cell envelope in keratinocytes and plays a role in epidermal development and maintenance. Stefins have been proposed as prognostic and diagnostic tools for cancer. |
| PHACTR1 | 221692 | ENSG00000112137 | phosphatase and actin regulator 1 | The protein encoded by this gene is a member of the phosphatase and actin regulator family of proteins. This family member can bind actin and regulate the reorganization of the actin cytoskeleton. It plays a role in tubule formation and in endothelial cell survival. Polymorphisms in this gene are associated with susceptibility to myocardial infarction, coronary artery disease and cervical artery dissection. Alternative splicing of this gene results in multiple transcript variants. |
| TINCR | 257000 | ENSG00000223573 | tissue differentiation-inducing non-protein coding RNA | This gene produces a spliced long non-coding RNA that is required for normal epidermal differentiation. This transcript regulates the expression of genes involved in the differentiation of epidermal tissue. Mutations in some of the genes targeted by this transcript have been implicated in epidermal skin diseases. |
| HBA2 | 3040 | ENSG00000188536 | hemoglobin subunit alpha 2 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. |
| ANXA1 | 301 | ENSG00000135046 | annexin A1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. |
| CREB3L1 | 90993 | ENSG00000157613 | cAMP responsive element binding protein 3 like 1 | The protein encoded by this gene is normally found in the membrane of the endoplasmic reticulum (ER). However, upon stress to the ER, the encoded protein is cleaved and the released cytoplasmic transcription factor domain translocates to the nucleus. There it activates the transcription of target genes by binding to box-B elements. |
| CCDC151 | 115948 | ENSG00000198003 | coiled-coil domain containing 151 | This gene encodes a protein containing coiled-coil domains. The encoded protein functions in outer dynein arm assembly and is required for motile cilia function. Mutations in this gene result in primary ciliary dyskinesia. Alternative splicing results in multiple transcript variants encoding different isoforms. |
| GNA15 | 2769 | ENSG00000060558 | G protein subunit alpha 15 | NA |
| CORO2A | 7464 | ENSG00000106789 | coronin 2A | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. This protein contains 5 WD repeats, and has a structural similarity with actin-binding proteins: the D. discoideum coronin and the human p57 protein, suggesting that this protein may also be an actin-binding protein that regulates cell motility. Alternative splicing of this gene generates 2 transcript variants. |
| BNIPL | 149428 | ENSG00000163141 | BCL2/adenovirus E1B 19kD interacting protein like | The protein encoded by this gene interacts with several other proteins, such as BCL2, ARHGAP1, MIF and GFER. It may function as a bridge molecule between BCL2 and ARHGAP1/CDC42 in promoting cell death. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. |
| SLC16A9 | 220963 | ENSG00000165449 | solute carrier family 16 member 9 | NA |
| FUT2 | 2524 | ENSG00000176920 | fucosyltransferase 2 | The protein encoded by this gene is a Golgi stack membrane protein that is involved in the creation of a precursor of the H antigen, which is required for the final step in the soluble A and B antigen synthesis pathway. This gene is one of two encoding the galactoside 2-L-fucosyltransferase enzyme. Two transcript variants encoding the same protein have been found for this gene. |
| HBA1 | 3039 | ENSG00000206172 | hemoglobin subunit alpha 1 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. |
| CRP | 1401 | ENSG00000132693 | C-reactive protein, pentraxin-related | The protein encoded by this gene belongs to the pentaxin family. It is involved in several host defense related functions based on its ability to recognize foreign pathogens and damaged cells of the host and to initiate their elimination by interacting with humoral and cellular effector systems in the blood. Consequently, the level of this protein in plasma increases greatly during acute phase response to tissue injury, infection, or other inflammatory stimuli. |
| LY6D | 8581 | ENSG00000167656 | lymphocyte antigen 6 complex, locus D | NA |
| S100A16 | 140576 | ENSG00000188643 | S100 calcium binding protein A16 | NA |
| ATF7IP2 | 80063 | ENSG00000166669 | activating transcription factor 7 interacting protein 2 | NA |
| HAS3 | 3038 | ENSG00000103044 | hyaluronan synthase 3 | The protein encoded by this gene is involved in the synthesis of the unbranched glycosaminoglycan hyaluronan, or hyaluronic acid, which is a major constituent of the extracellular matrix. This gene is a member of the NODC/HAS gene family. Compared to the proteins encoded by other members of this gene family, this protein appears to be more of a regulator of hyaluronan synthesis. Alternative splicing results in multiple transcript variants. |
| AC004951.5 | ENSG00000239556 | ENSG00000239556 | NA | NA |
| ALDH1A3 | 220 | ENSG00000184254 | aldehyde dehydrogenase 1 family member A3 | This gene encodes an aldehyde dehydrogenase enzyme that uses retinal as a substrate. Mutations in this gene have been associated with microphthalmia, isolated 8, and expression changes have also been detected in tumor cells. Alternative splicing results in multiple transcript variants. |
| ALOX12 | 239 | ENSG00000108839 | arachidonate 12-lipoxygenase, 12S type | NA |
| ATG9B | 285973 | ENSG00000181652 | autophagy related 9B | This gene functions in the regulation of autophagy, a lysosomal degradation pathway. This gene also functions as an antisense transcript in the posttranscriptional regulation of the endothelial nitric oxide synthase 3 gene, which has 3’ overlap with this gene on the opposite strand. Mutations in this gene and disruption of the autophagy process have been associated with multiple cancers. Alternative splicing results in multiple transcript variants. |
| ANKRD65 | 441869 | ENSG00000235098 | ankyrin repeat domain 65 | NA |
| GBP1P1 | 400759 | ENSG00000225492 | guanylate binding protein 1 pseudogene 1 | NA |
| CDKN2B | 1030 | ENSG00000147883 | cyclin-dependent kinase inhibitor 2B | This gene lies adjacent to the tumor suppressor gene CDKN2A in a region that is frequently mutated and deleted in a wide variety of tumors. This gene encodes a cyclin-dependent kinase inhibitor, which forms a complex with CDK4 or CDK6, and prevents the activation of the CDK kinases, thus the encoded protein functions as a cell growth regulator that controls cell cycle G1 progression. The expression of this gene was found to be dramatically induced by TGF beta, which suggested its role in the TGF beta induced growth inhibition. Two alternatively spliced transcript variants of this gene, which encode distinct proteins, have been reported. |
| RP11-798K23.5 | ENSG00000253520 | ENSG00000253520 | NA | NA |
| SLC16A6 | 9120 | ENSG00000108932 | solute carrier family 16 member 6 | NA |
| HPDL | 84842 | ENSG00000186603 | 4-hydroxyphenylpyruvate dioxygenase like | NA |
| CCND2-AS1 | 103752584 | ENSG00000256164 | CCND2 antisense RNA 1 | NA |
| RAPGEFL1 | 51195 | ENSG00000108352 | Rap guanine nucleotide exchange factor like 1 | NA |
| CTD-2201G16.1 | ENSG00000258444 | ENSG00000258444 | NA | NA |
| TMEM79 | 84283 | ENSG00000163472 | transmembrane protein 79 | NA |
| N4BP3 | 23138 | ENSG00000145911 | NEDD4 binding protein 3 | NA |
| ARG2 | 384 | ENSG00000081181 | arginase 2 | Arginase catalyzes the hydrolysis of arginine to ornithine and urea. At least two isoforms of mammalian arginase exist (types I and II) which differ in their tissue distribution, subcellular localization, immunologic crossreactivity and physiologic function. The type II isoform encoded by this gene is located in the mitochondria and expressed in extra-hepatic tissues, especially kidney. The physiologic role of this isoform is poorly understood; it is thought to play a role in nitric oxide and polyamine metabolism. Transcript variants of the type II gene resulting from the use of alternative polyadenylation sites have been described. |
| MUC20P1 | ENSG00000224769 | ENSG00000224769 | mucin 20, cell surface associated pseudogene 1 | NA |
| MYH7 | 4625 | ENSG00000092054 | myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. |
| CCDC114 | 93233 | ENSG00000105479 | coiled-coil domain containing 114 | This gene encodes a coiled-coil domain-containing protein that is a component of the outer dynein arm docking complex in cilia cells. Mutations in this gene may cause primary ciliary dyskinesia 20. |
| RHOV | 171177 | ENSG00000104140 | ras homolog family member V | NA |
| SH3BGR | 6450 | ENSG00000185437 | SH3 domain binding glutamate rich protein | NA |
| RP11-316O14.1 | ENSG00000268603 | ENSG00000268603 | NA | NA |
| HBB | 3043 | ENSG00000244734 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin cause beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. |
| SLC9A3 | 6550 | ENSG00000066230 | solute carrier family 9 member A3 | The protein encoded by this gene is an epithelial brush border Na/H exchanger that uses an inward sodium ion gradient to expel acids from the cell. Defects in this gene are a cause of congenital secretory sodium diarrhea. Pseudogenes of this gene exist on chromosomes 10 and 22. |
| FAM117A | 81558 | ENSG00000121104 | family with sequence similarity 117 member A | NA |
| SNRPN | 6638 | ENSG00000128739 | small nuclear ribonucleoprotein polypeptide N | The protein encoded by this gene is one polypeptide of a small nuclear ribonucleoprotein complex and belongs to the snRNP SMB/SMN family. The protein plays a role in pre-mRNA processing, possibly tissue-specific alternative splicing events. Although individual snRNPs are believed to recognize specific nucleic acid sequences through RNA-RNA base pairing, the specific role of this family member is unknown. The protein arises from a bicistronic transcript that also encodes a protein identified as the SNRPN upstream reading frame (SNURF). Multiple transcription initiation sites have been identified and extensive alternative splicing occurs in the 5’ untranslated region. Additional splice variants have been described but sequences for the complete transcripts have not been determined. The 5’ UTR of this gene has been identified as an imprinting center. Alternative splicing or deletion caused by a translocation event in this paternally-expressed region is responsible for Angelman syndrome or Prader-Willi syndrome due to parental imprint switch failure. |
| SLC45A4 | 57210 | ENSG00000022567 | solute carrier family 45 member 4 | NA |
| DSC2 | 1824 | ENSG00000134755 | desmocollin 2 | This gene encodes a member of the desmocollin protein subfamily. Desmocollins, along with desmogleins, are cadherin-like transmembrane glycoproteins that are major components of the desmosome. Desmosomes are cell-cell junctions that help resist shearing forces and are found in high concentrations in cells subject to mechanical stress. This gene is found in a cluster with other desmocollin family members on chromosome 18. Mutations in this gene are associated with arrhythmogenic right ventricular dysplasia-11, and reduced protein expression has been described in several types of cancer. Alternative splicing results in multiple transcript variants. |
| TTC25 | 83538 | ENSG00000204815 | tetratricopeptide repeat domain 25 | NA |
| CHPF | 79586 | ENSG00000123989 | chondroitin polymerizing factor | NA |
| TNNI3 | 7137 | ENSG00000129991 | troponin I3, cardiac type | Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. This gene encodes the TnI-cardiac protein and is exclusively expressed in cardiac muscle tissues. Mutations in this gene cause familial hypertrophic cardiomyopathy type 7 (CMH7) and familial restrictive cardiomyopathy (RCM). |
| LMF1 | 64788 | ENSG00000260807 | lipase maturation factor 1 | The protein encoded by this gene resides in the endoplasmic reticulum, and is involved in the maturation and transport of lipoprotein lipase through the secretory pathway. Mutations in this gene are associated with combined lipase deficiency. Alternatively spliced transcript variants have been found for this gene. |
| DBNDD1 | 79007 | ENSG00000003249 | dysbindin (dystrobrevin binding protein 1) domain containing 1 | NA |
| FGA | 2243 | ENSG00000171560 | fibrinogen alpha chain | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. |
| SOX15 | 6665 | ENSG00000129194 | SRY-box 15 | This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins. |
| AHNAK2 | 113146 | ENSG00000185567 | AHNAK nucleoprotein 2 | NA |
| MTND6P4 | ENSG00000249119 | ENSG00000249119 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 6 pseudogene 4 | NA |
| BOK | 666 | ENSG00000176720 | BCL2-related ovarian killer | The protein encoded by this gene belongs to the BCL2 family, members of which form homo- or heterodimers, and act as anti- or proapoptotic regulators that are involved in a wide variety of cellular processes. Studies in rat show that this protein has restricted expression in reproductive tissues, interacts strongly with some antiapoptotic BCL2 proteins, not at all with proapoptotic BCL2 proteins, and induces apoptosis in transfected cells. Thus, this protein represents a proapoptotic member of the BCL2 family. |
| FABP5P7 | ENSG00000234964 | ENSG00000234964 | fatty acid binding protein 5 pseudogene 7 | NA |
| COL4A4 | 1286 | ENSG00000081052 | collagen type IV alpha 4 chain | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. This particular collagen IV subunit, however, is only found in a subset of basement membranes. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. Mutations in this gene are associated with type II autosomal recessive Alport syndrome (hereditary glomerulonephropathy) and with familial benign hematuria (thin basement membrane disease). Two transcripts, differing only in their transcription start sites, have been identified for this gene and, as is common for collagen genes, multiple polyadenylation sites are found in the 3’ UTR. |
| PHYHIP | 9796 | ENSG00000168490 | phytanoyl-CoA 2-hydroxylase interacting protein | NA |
| RP11-732A19.5 | ENSG00000255390 | ENSG00000255390 | NA | NA |
| APOL4 | 80832 | ENSG00000100336 | apolipoprotein L4 | The protein encoded by this gene is a member of the apolipoprotein L family and may play a role in lipid exchange and transport throughout the body, as well as in reverse cholesterol transport from peripheral cells to the liver. Two transcript variants encoding two different isoforms have been found for this gene. Only one of the isoforms appears to be a secreted protein. |
| MMP23B | 8510 | ENSG00000189409 | matrix metallopeptidase 23B | This gene (MMP23B) encodes a member of the matrix metalloproteinase (MMP) family, and it is part of a duplicated region of chromosome 1p36.3. Proteins of the matrix metalloproteinase (MMP) family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. This gene belongs to the more telomeric copy of the duplicated region. |
| OXCT2P1 | ENSG00000237624 | ENSG00000237624 | 3-oxoacid CoA-transferase 2 pseudogene 1 | NA |
| DCHS1 | 8642 | ENSG00000166341 | dachsous cadherin-related 1 | This gene is a member of the cadherin superfamily whose members encode calcium-dependent cell-cell adhesion molecules. The encoded protein has a signal peptide, 27 cadherin repeat domains and a unique cytoplasmic region. This particular cadherin family member is expressed in fibroblasts but not in melanocytes or keratinocytes. The cell-cell adhesion of fibroblasts is thought to be necessary for wound healing. |
| MXD1 | 4084 | ENSG00000059728 | MAX dimerization protein 1 | This gene encodes a member of the MYC/MAX/MAD network of basic helix-loop-helix leucine zipper transcription factors. The MYC/MAX/MAD transcription factors mediate cellular proliferation, differentiation and apoptosis. The encoded protein antagonizes MYC-mediated transcriptional activation of target genes by competing for the binding partner MAX and recruiting repressor complexes containing histone deacetylases. Mutations in this gene may play a role in acute leukemia, and the encoded protein is a potential tumor suppressor. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
# Save the Ensembl IDs queried for cluster 18, one per line
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_", 18, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
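Some of the annotation tables in this analysis carry a `notfound` column, which marks Ensembl IDs that the query could not match, while others lack the column entirely. The sketch below shows one way to separate matched from unmatched IDs before writing them out. The helper `split_by_match` and the toy data frame are hypothetical and not part of the original analysis; the toy rows are copied from the cluster 19 table.

```r
# Hypothetical helper (an assumption, not part of the original analysis):
# split a queryMany()-style result into matched rows and unmatched query IDs.
split_by_match <- function(res) {
  res <- as.data.frame(res)
  # The notfound column is only present when at least one query failed
  if ("notfound" %in% names(res)) {
    missed <- res$notfound %in% TRUE   # NA counts as "found"
  } else {
    missed <- rep(FALSE, nrow(res))
  }
  list(matched   = res[!missed, , drop = FALSE],
       unmatched = as.character(res$query[missed]))
}

# Toy stand-in for `out`, using three rows from the cluster 19 table
toy <- data.frame(
  query    = c("ENSG00000090104", "ENSG00000259716", "ENSG00000112936"),
  symbol   = c("RGS1", NA, "C7"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

parts <- split_by_match(toy)
parts$unmatched        # IDs with no annotation
parts$matched$symbol   # symbols of the annotated genes
```

Whether unmatched IDs should be kept in the per-cluster gene list depends on the downstream tool; the `write.table` calls in this document keep every queried ID.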
out <- mygene::queryMany(gene_list[19, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | X_id | summary | symbol | query | notfound |
|---|---|---|---|---|---|
| regulator of G-protein signaling 1 | 5996 | This gene encodes a member of the regulator of G-protein signalling family. This protein is located on the cytosolic side of the plasma membrane and contains a conserved, 120 amino acid motif called the RGS domain. The protein attenuates the signalling activity of G-proteins by binding to activated, GTP-bound G alpha subunits and acting as a GTPase activating protein (GAP), increasing the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. | RGS1 | ENSG00000090104 | NA |
| complement component 7 | 730 | C7 is a component of the complement system. It participates in the formation of Membrane Attack Complex (MAC). People with C7 deficiency are prone to bacterial infection. | C7 | ENSG00000112936 | NA |
| indolethylamine N-methyltransferase | 11185 | N-methylation of endogenous and xenobiotic compounds is a major method by which they are degraded. This gene encodes an enzyme that N-methylates indoles such as tryptamine. Alternative splicing results in multiple transcript variants. Read-through transcription also exists between this gene and the downstream FAM188B (family with sequence similarity 188, member B) gene. | INMT | ENSG00000241644 | NA |
| calponin 1 | 1264 | NA | CNN1 | ENSG00000130176 | NA |
| myelin protein zero like 2 | 10205 | Thymus development depends on a complex series of interactions between thymocytes and the stromal component of the organ. Epithelial V-like antigen (EVA) is expressed in thymus epithelium and strongly downregulated by thymocyte developmental progression. This gene is expressed in the thymus and in several epithelial structures early in embryogenesis. It is highly homologous to the myelin protein zero and, in thymus-derived epithelial cell lines, is poorly soluble in nonionic detergents, strongly suggesting an association to the cytoskeleton. Its capacity to mediate cell adhesion through a homophilic interaction and its selective regulation by T cell maturation might imply the participation of EVA in the earliest phases of thymus organogenesis. The protein bears a characteristic V-type domain and two potential N-glycosylation sites in the extracellular domain; a putative serine phosphorylation site for casein kinase 2 is also present in the cytoplasmic tail. Two transcript variants encoding the same protein have been found for this gene. | MPZL2 | ENSG00000149573 | NA |
| tryptase alpha/beta 1 | 7177 | Tryptases comprise a family of trypsin-like serine proteases, the peptidase family S1. Tryptases are enzymatically active only as heparin-stabilized tetramers, and they are resistant to all known endogenous proteinase inhibitors. Several tryptase genes are clustered on chromosome 16p13.3. These genes are characterized by several distinct features. They have a highly conserved 3’ UTR and contain tandem repeat sequences at the 5’ flank and 3’ UTR which are thought to play a role in regulation of the mRNA stability. These genes have an intron immediately upstream of the initiator Met codon, which separates the site of transcription initiation from protein coding sequence. This feature is characteristic of tryptases but is unusual in other genes. The alleles of this gene exhibit an unusual amount of sequence variation, such that the alleles were once thought to represent two separate genes, alpha and beta 1. Beta tryptases appear to be the main isoenzymes expressed in mast cells; whereas in basophils, alpha tryptases predominate. Tryptases have been implicated as mediators in the pathogenesis of asthma and other allergic and inflammatory disorders. | TPSAB1 | ENSG00000172236 | NA |
| NA | ENSG00000263065 | NA | AF001548.6 | ENSG00000263065 | NA |
| solute carrier organic anion transporter family member 2A1 | 6578 | This gene encodes a prostaglandin transporter that is a member of the 12-membrane-spanning superfamily of transporters. The encoded protein may be involved in mediating the uptake and clearance of prostaglandins in numerous tissues. | SLCO2A1 | ENSG00000174640 | NA |
| actin, gamma 2, smooth muscle, enteric | 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | ACTG2 | ENSG00000163017 | NA |
| myosin, heavy chain 11, smooth muscle | 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | MYH11 | ENSG00000133392 | NA |
| C-C motif chemokine ligand 21 | 6366 | This antimicrobial gene is one of several CC cytokine genes clustered on the p-arm of chromosome 9. Cytokines are a family of secreted proteins involved in immunoregulatory and inflammatory processes. The CC cytokines are proteins characterized by two adjacent cysteines. Similar to other chemokines the protein encoded by this gene inhibits hemopoiesis and stimulates chemotaxis. This protein is chemotactic in vitro for thymocytes and activated T cells, but not for B cells, macrophages, or neutrophils. The cytokine encoded by this gene may also play a role in mediating homing of lymphocytes to secondary lymphoid organs. It is a high affinity functional ligand for chemokine receptor 7 that is expressed on T and B lymphocytes and a known receptor for another member of the cytokine family (small inducible cytokine A19). | CCL21 | ENSG00000137077 | NA |
| C-C motif chemokine ligand 19 | 6363 | This antimicrobial gene is one of several CC cytokine genes clustered on the p-arm of chromosome 9. Cytokines are a family of secreted proteins involved in immunoregulatory and inflammatory processes. The CC cytokines are proteins characterized by two adjacent cysteines. The cytokine encoded by this gene may play a role in normal lymphocyte recirculation and homing. It also plays an important role in trafficking of T cells in thymus, and in T cell and B cell migration to secondary lymphoid organs. It specifically binds to chemokine receptor CCR7. | CCL19 | ENSG00000172724 | NA |
| charged multivesicular body protein 4C | 92421 | CHMP4C belongs to the chromatin-modifying protein/charged multivesicular body protein (CHMP) family. These proteins are components of ESCRT-III (endosomal sorting complex required for transport III), a complex involved in degradation of surface receptor proteins and formation of endocytic multivesicular bodies (MVBs). Some CHMPs have both nuclear and cytoplasmic/vesicular distributions, and one such CHMP, CHMP1A (MIM 164010), is required for both MVB formation and regulation of cell cycle progression (Tsang et al., 2006 [PubMed 16730941]). | CHMP4C | ENSG00000164695 | NA |
| mucin 1, cell surface associated | 4582 | This gene encodes a membrane-bound protein that is a member of the mucin family. Mucins are O-glycosylated proteins that play an essential role in forming protective mucous barriers on epithelial surfaces. These proteins also play a role in intracellular signaling. This protein is expressed on the apical surface of epithelial cells that line the mucosal surfaces of many different tissues including lung, breast, stomach and pancreas. This protein is proteolytically cleaved into alpha and beta subunits that form a heterodimeric complex. The N-terminal alpha subunit functions in cell-adhesion and the C-terminal beta subunit is involved in cell signaling. Overexpression, aberrant intracellular localization, and changes in glycosylation of this protein have been associated with carcinomas. This gene is known to contain a highly polymorphic variable number tandem repeats (VNTR) domain. Alternate splicing results in multiple transcript variants. | MUC1 | ENSG00000185499 | NA |
| gap junction protein beta 2 | 2706 | This gene encodes a member of the gap junction protein family. The gap junctions were first characterized by electron microscopy as regionally specialized structures on plasma membranes of contacting adherent cells. These structures were shown to consist of cell-to-cell channels that facilitate the transfer of ions and small molecules between cells. The gap junction proteins, also known as connexins, purified from fractions of enriched gap junctions from different tissues differ. According to sequence similarities at the nucleotide and amino acid levels, the gap junction proteins are divided into two categories, alpha and beta. Mutations in this gene are responsible for as much as 50% of pre-lingual, recessive deafness. | GJB2 | ENSG00000165474 | NA |
| osteoglycin | 4969 | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family of proteins. The encoded protein induces ectopic bone formation in conjunction with transforming growth factor beta and may regulate osteoblast differentiation. High expression of the encoded protein may be associated with elevated heart left ventricular mass. Alternative splicing results in multiple transcript variants. | OGN | ENSG00000106809 | NA |
| NA | NA | NA | NA | ENSG00000259716 | TRUE |
| apolipoprotein L4 | 80832 | The protein encoded by this gene is a member of the apolipoprotein L family and may play a role in lipid exchange and transport throughout the body, as well as in reverse cholesterol transport from peripheral cells to the liver. Two transcript variants encoding two different isoforms have been found for this gene. Only one of the isoforms appears to be a secreted protein. | APOL4 | ENSG00000100336 | NA |
| myosin light chain 9 | 10398 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | MYL9 | ENSG00000101335 | NA |
| cadherin EGF LAG seven-pass G-type receptor 1 | 9620 | The protein encoded by this gene is a member of the flamingo subfamily, part of the cadherin superfamily. The flamingo subfamily consists of nonclassic-type cadherins; a subpopulation that does not interact with catenins. The flamingo cadherins are located at the plasma membrane and have nine cadherin domains, seven epidermal growth factor-like repeats and two laminin A G-type repeats in their ectodomain. They also have seven transmembrane domains, a characteristic unique to this subfamily. It is postulated that these proteins are receptors involved in contact-mediated communication, with cadherin domains acting as homophilic binding regions and the EGF-like domains involved in cell adhesion and receptor-ligand interactions. This particular member is a developmentally regulated, neural-specific gene which plays an unspecified role in early embryogenesis. | CELSR1 | ENSG00000075275 | NA |
| NA | NA | NA | NA | ENSG00000187990 | TRUE |
| keratin 8 | 3856 | This gene is a member of the type II keratin family clustered on the long arm of chromosome 12. Type I and type II keratins heteropolymerize to form intermediate-sized filaments in the cytoplasm of epithelial cells. The product of this gene typically dimerizes with keratin 18 to form an intermediate filament in simple single-layered epithelial cells. This protein plays a role in maintaining cellular structural integrity and also functions in signal transduction and cellular differentiation. Mutations in this gene cause cryptogenic cirrhosis. Alternatively spliced transcript variants have been found for this gene. | KRT8 | ENSG00000170421 | NA |
| myocilin | 4653 | MYOC encodes the protein myocilin, which is believed to have a role in cytoskeletal function. MYOC is expressed in many ocular tissues, including the trabecular meshwork, and was revealed to be the trabecular meshwork glucocorticoid-inducible response protein (TIGR). The trabecular meshwork is a specialized eye tissue essential in regulating intraocular pressure, and mutations in MYOC have been identified as the cause of hereditary juvenile-onset open-angle glaucoma. | MYOC | ENSG00000034971 | NA |
| NA | NA | NA | NA | ENSG00000180672 | TRUE |
| nephronectin | 255743 | NA | NPNT | ENSG00000168743 | NA |
| NA | ENSG00000269936 | NA | RP11-394O4.5 | ENSG00000269936 | NA |
| actin, alpha 2, smooth muscle, aorta | 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ACTA2 | ENSG00000107796 | NA |
| latent transforming growth factor beta binding protein 4 | 8425 | The protein encoded by this gene binds transforming growth factor beta (TGFB) as it is secreted and targeted to the extracellular matrix. TGFB is biologically latent after secretion and insertion into the extracellular matrix, and sheds TGFB and other proteins upon activation. Defects in this gene may be a cause of cutis laxa and severe pulmonary, gastrointestinal, and urinary abnormalities. Three transcript variants encoding different isoforms have been found for this gene. | LTBP4 | ENSG00000090006 | NA |
| lipin 3 | 64900 | The protein encoded by this gene is a member of the lipin family of proteins, and all family members share strong homology in their C-terminal region. This protein is thought to form hetero-oligomers with other lipin family members, while one family member, lipin 1, can also form homo-oligomers. This protein contains conserved motifs for phosphatidate phosphatase 1 (PAP1) activity as well as a domain that interacts with a transcriptional co-activator. Lipin complexes act in the cytoplasm to catalyze the dephosphorylation of phosphatidic acid to produce diacylglycerol, which is the precursor of both triglycerides and phospholipids. Lipin complexes are also thought to regulate gene expression as transcriptional co-activators in the nucleus. Alternative splicing results in multiple transcript variants. | LPIN3 | ENSG00000132793 | NA |
| LY6/PLAUR domain containing 3 | 27076 | NA | LYPD3 | ENSG00000124466 | NA |
| ACTA2 antisense RNA 1 | ENSG00000180139 | NA | ACTA2-AS1 | ENSG00000180139 | NA |
| aquaporin 3 (Gill blood group) | 360 | This gene encodes the water channel protein aquaporin 3. Aquaporins are a family of small integral membrane proteins related to the major intrinsic protein, also known as aquaporin 0. Aquaporin 3 is localized at the basal lateral membranes of collecting duct cells in the kidney. In addition to its water channel function, aquaporin 3 has been found to facilitate the transport of nonionic small solutes such as urea and glycerol, but to a smaller degree. It has been suggested that water channels can be functionally heterogeneous and possess water and solute permeation mechanisms. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | AQP3 | ENSG00000165272 | NA |
| carboxypeptidase X (M14 family), member 2 | 119587 | NA | CPXM2 | ENSG00000121898 | NA |
| kinesin family member 23 | 9493 | The protein encoded by this gene is a member of kinesin-like protein family. This family includes microtubule-dependent molecular motors that transport organelles within cells and move chromosomes during cell division. This protein has been shown to cross-bridge antiparallel microtubules and drive microtubule movement in vitro. Alternate splicing of this gene results in multiple transcript variants. | KIF23 | ENSG00000137807 | NA |
| NA | ENSG00000232993 | NA | RP11-334A14.5 | ENSG00000232993 | NA |
| netrin 1 | 9423 | Netrin is included in a family of laminin-related secreted proteins. The function of this gene has not yet been defined; however, netrin is thought to be involved in axon guidance and cell migration during development. Mutations and loss of expression of netrin suggest that variation in netrin may be involved in cancer development. | NTN1 | ENSG00000065320 | NA |
| prolyl 3-hydroxylase 2 | 55214 | This gene encodes a member of the prolyl 3-hydroxylase subfamily of 2-oxo-glutarate-dependent dioxygenases. These enzymes play a critical role in collagen chain assembly, stability and cross-linking by catalyzing post-translational 3-hydroxylation of proline residues. Mutations in this gene are associated with nonsyndromic severe myopia with cataract and vitreoretinal degeneration, and downregulation of this gene may play a role in breast cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | P3H2 | ENSG00000090530 | NA |
| interleukin 1 receptor antagonist | 3557 | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | IL1RN | ENSG00000136689 | NA |
| plasmalemma vesicle associated protein | 83483 | NA | PLVAP | ENSG00000130300 | NA |
| family with sequence similarity 46 member B | 115572 | NA | FAM46B | ENSG00000158246 | NA |
| NA | ENSG00000263335 | NA | AF001548.5 | ENSG00000263335 | NA |
| cingulin-like 1 | 84952 | This gene encodes a member of the cingulin family. The encoded protein localizes to both adherens and tight cell-cell junctions and mediates junction assembly and maintenance by regulating the activity of the small GTPases RhoA and Rac1. Heterozygous chromosomal rearrangements resulting in association of the promoter for this gene with the aromatase gene are a cause of aromatase excess syndrome. Alternatively spliced transcript variants have been observed for this gene. | CGNL1 | ENSG00000128849 | NA |
| RNA binding protein with multiple splicing 2 | 348093 | NA | RBPMS2 | ENSG00000166831 | NA |
| NA | ENSG00000249007 | NA | RP11-510N19.5 | ENSG00000249007 | NA |
| proline and arginine rich end leucine rich repeat protein | 5549 | The protein encoded by this gene is a leucine-rich repeat protein present in connective tissue extracellular matrix. This protein functions as a molecule anchoring basement membranes to the underlying connective tissue. This protein has been shown to bind type I collagen to basement membranes and type II collagen to cartilage. It also binds the basement membrane heparan sulfate proteoglycan perlecan. This protein is suggested to be involved in the pathogenesis of Hutchinson-Gilford progeria (HGP), which is reported to lack the binding of collagen in basement membranes and cartilage. Alternatively spliced transcript variants encoding the same protein have been observed. | PRELP | ENSG00000188783 | NA |
| HOXA transcript antisense RNA, myeloid-specific 1 | ENSG00000233429 | NA | HOTAIRM1 | ENSG00000233429 | NA |
| sosondowah ankyrin repeat domain family member C | 65124 | NA | SOWAHC | ENSG00000198142 | NA |
| NA | ENSG00000253520 | NA | RP11-798K23.5 | ENSG00000253520 | NA |
| potassium two pore domain channel subfamily K member 6 | 9424 | This gene encodes one of the members of the superfamily of potassium channel proteins containing two pore-forming P domains. This channel protein, considered an open rectifier, is widely expressed. It is stimulated by arachidonic acid, and inhibited by internal acidification and volatile anaesthetics. | KCNK6 | ENSG00000099337 | NA |
| colorectal cancer associated 2 | 120376 | NA | COLCA2 | ENSG00000214290 | NA |
| cytochrome P450 family 2 subfamily S member 1 | 29785 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. In rodents, the homologous protein has been shown to metabolize certain carcinogens; however, the specific function of the human protein has not been determined. | CYP2S1 | ENSG00000167600 | NA |
| lumican | 4060 | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin. In these bifunctional molecules, the protein moiety binds collagen fibrils and the highly charged hydrophilic glycosaminoglycans regulate interfibrillar spacings. Lumican is the major keratan sulfate proteoglycan of the cornea but is also distributed in interstitial collagenous matrices throughout the body. Lumican may regulate collagen fibril organization and circumferential growth, corneal transparency, and epithelial cell migration and tissue repair. | LUM | ENSG00000139329 | NA |
| NOTCH1 associated lncRNA in T-cell acute lymphoblastic leukemia 1 | ENSG00000237886 | NA | NALT1 | ENSG00000237886 | NA |
| NA | ENSG00000271133 | NA | CTA-293F17.1 | ENSG00000271133 | NA |
| secreted frizzled related protein 2 | 6423 | This gene encodes a member of the SFRP family that contains a cysteine-rich domain homologous to the putative Wnt-binding site of Frizzled proteins. SFRPs act as soluble modulators of Wnt signaling. Methylation of this gene is a potential marker for the presence of colorectal cancer. | SFRP2 | ENSG00000145423 | NA |
| matrix Gla protein | 4256 | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | MGP | ENSG00000111341 | NA |
| small nucleolar RNA host gene 18 | ENSG00000250786 | NA | SNHG18 | ENSG00000250786 | NA |
| alcohol dehydrogenase 1B (class I), beta polypeptide | 125 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | ADH1B | ENSG00000196616 | NA |
| purinergic receptor P2Y1 | 5028 | The product of this gene belongs to the family of G-protein coupled receptors. This family has several receptor subtypes with different pharmacological selectivity, which overlaps in some cases, for various adenosine and uridine nucleotides. This receptor functions as a receptor for extracellular ATP and ADP. In platelets, binding to ADP leads to mobilization of intracellular calcium ions via activation of phospholipase C, a change in platelet shape, and probably to platelet aggregation. | P2RY1 | ENSG00000169860 | NA |
| NA | NA | NA | NA | ENSG00000268913 | TRUE |
| CDC42 effector protein 5 | 148170 | Cell division control protein 42 (CDC42), a small Rho GTPase, regulates the formation of F-actin-containing structures through its interaction with the downstream effector proteins. The protein encoded by this gene is a member of the Borg (binder of Rho GTPases) family of CDC42 effector proteins. Borg family proteins contain a CRIB (Cdc42/Rac interactive-binding) domain. They bind to CDC42 and regulate its function negatively. The encoded protein may inhibit c-Jun N-terminal kinase (JNK) independently of CDC42 binding. The protein may also play a role in septin organization and inducing pseudopodia formation in fibroblasts. | CDC42EP5 | ENSG00000167617 | NA |
| tumor-associated calcium signal transducer 2 | 4070 | This intronless gene encodes a carcinoma-associated antigen. This antigen is a cell surface receptor that transduces calcium signals. Mutations of this gene have been associated with gelatinous drop-like corneal dystrophy. | TACSTD2 | ENSG00000184292 | NA |
| baculoviral IAP repeat containing 3 | 330 | This gene encodes a member of the IAP family of proteins that inhibit apoptosis by binding to tumor necrosis factor receptor-associated factors TRAF1 and TRAF2, probably by interfering with activation of ICE-like proteases. The encoded protein inhibits apoptosis induced by serum deprivation but does not affect apoptosis resulting from exposure to menadione, a potent inducer of free radicals. It contains 3 baculovirus IAP repeats and a ring finger domain. Transcript variants encoding the same isoform have been identified. | BIRC3 | ENSG00000023445 | NA |
| phospholipase A2 group V | 5322 | This gene is a member of the secretory phospholipase A2 family. It is located in a tightly-linked cluster of secretory phospholipase A2 genes on chromosome 1. The encoded enzyme catalyzes the hydrolysis of membrane phospholipids to generate lysophospholipids and free fatty acids including arachidonic acid. It preferentially hydrolyzes linoleoyl-containing phosphatidylcholine substrates. Secretion of this enzyme is thought to induce inflammatory responses in neighboring cells. Alternatively spliced transcript variants have been found, but their full-length nature has not been determined. | PLA2G5 | ENSG00000127472 | NA |
| podocan | 127435 | NA | PODN | ENSG00000174348 | NA |
| desmoplakin | 1832 | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | DSP | ENSG00000096696 | NA |
| erythrocyte membrane protein band 4.1 like 4A | 64097 | Members of the band 4.1 protein superfamily, including EPB41L4A, are thought to regulate the interaction between the cytoskeleton and plasma membrane (Ishiguro et al., 2000 [PubMed 10874211]). | EPB41L4A | ENSG00000129595 | NA |
| pleckstrin homology domain containing A4 | 57664 | NA | PLEKHA4 | ENSG00000105559 | NA |
| EPH receptor A2 | 1969 | This gene belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family. EPH and EPH-related receptors have been implicated in mediating developmental events, particularly in the nervous system. Receptors in the EPH subfamily typically have a single kinase domain and an extracellular region containing a Cys-rich domain and 2 fibronectin type III repeats. The ephrin receptors are divided into 2 groups based on the similarity of their extracellular domain sequences and their affinities for binding ephrin-A and ephrin-B ligands. This gene encodes a protein that binds ephrin-A ligands. Mutations in this gene are the cause of certain genetically-related cataract disorders. | EPHA2 | ENSG00000142627 | NA |
| cell adhesion molecule L1 like | 10752 | The protein encoded by this gene is a member of the L1 gene family of neural cell adhesion molecules. It is a neural recognition molecule that may be involved in signal transduction pathways. The deletion of one copy of this gene may be responsible for mental defects in patients with 3p- syndrome. This protein may also play a role in the growth of certain cancers. Alternate splicing results in both coding and non-coding variants. | CHL1 | ENSG00000134121 | NA |
| acid phosphatase 5, tartrate resistant | 54 | This gene encodes an iron containing glycoprotein which catalyzes the conversion of orthophosphoric monoester to alcohol and orthophosphate. It is the most basic of the acid phosphatases and is the only form not inhibited by L(+)-tartrate. | ACP5 | ENSG00000102575 | NA |
| phospholipase A2 group IIA | 5320 | The protein encoded by this gene is a member of the phospholipase A2 family (PLA2). PLA2s constitute a diverse family of enzymes with respect to sequence, function, localization, and divalent cation requirements. This gene product belongs to group II, which contains the secreted form of PLA2, an extracellular enzyme that has a low molecular mass and requires calcium ions for catalysis. It catalyzes the hydrolysis of the sn-2 fatty acid acyl ester bond of phosphoglycerides, releasing free fatty acids and lysophospholipids, and is thought to participate in the regulation of the phospholipid metabolism in biomembranes. Several alternatively spliced transcript variants with different 5’ UTRs have been found for this gene. | PLA2G2A | ENSG00000188257 | NA |
| RAS like family 12 | 51285 | NA | RASL12 | ENSG00000103710 | NA |
| microfibrillar associated protein 4 | 4239 | This gene encodes a protein with similarity to a bovine microfibril-associated protein. The protein has binding specificities for both collagen and carbohydrate. It is thought to be an extracellular matrix protein which is involved in cell adhesion or intercellular interactions. The gene is located within the Smith-Magenis syndrome region. Two transcript variants encoding different isoforms have been found for this gene. | MFAP4 | ENSG00000166482 | NA |
| epithelial membrane protein 1 | 2012 | NA | EMP1 | ENSG00000134531 | NA |
| PERP, TP53 apoptosis effector | 64065 | NA | PERP | ENSG00000112378 | NA |
| chromosome 15 open reading frame 48 | 84419 | This gene was first identified in a study of human esophageal squamous cell carcinoma tissues. Levels of both the message and protein are reduced in carcinoma samples. In adult human tissues, this gene is expressed in the esophagus, stomach, small intestine, colon and placenta. Alternatively spliced transcript variants that encode the same protein have been identified. | C15orf48 | ENSG00000166920 | NA |
| ADIRF antisense RNA 1 | ENSG00000272734 | NA | ADIRF-AS1 | ENSG00000272734 | NA |
| chromosome 3 open reading frame 52 | 79669 | NA | C3orf52 | ENSG00000114529 | NA |
| von Willebrand factor A domain containing 1 | 64856 | VWA1 belongs to the von Willebrand factor (VWF; MIM 613160) A (VWFA) domain superfamily of extracellular matrix proteins and appears to play a role in cartilage structure and function (Fitzgerald et al., 2002 [PubMed 12062410]). | VWA1 | ENSG00000179403 | NA |
| tumor necrosis factor receptor superfamily member 19 | 55504 | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor is highly expressed during embryonic development. It has been shown to interact with TRAF family members, and to activate JNK signaling pathway when overexpressed in cells. This receptor is capable of inducing apoptosis by a caspase-independent mechanism, and it is thought to play an essential role in embryonic development. Alternatively spliced transcript variants encoding distinct isoforms have been described. | TNFRSF19 | ENSG00000127863 | NA |
| chloride intracellular channel 6 | 54102 | This gene encodes a member of the chloride intracellular channel family of proteins. The gene is part of a large triplicated region found on chromosomes 1, 6, and 21. Alternative splicing results in multiple transcript variants encoding different isoforms. | CLIC6 | ENSG00000159212 | NA |
| uncharacterized LOC100506314 | 100506314 | NA | LOC100506314 | ENSG00000247498 | NA |
| insulin like growth factor binding protein 2 | 3485 | The protein encoded by this gene is one of six similar proteins that bind insulin-like growth factors I and II (IGF-I and IGF-II). The encoded protein can be secreted into the bloodstream, where it binds IGF-I and IGF-II with high affinity, or it can remain intracellular, interacting with many different ligands. High expression levels of this protein promote the growth of several types of tumors and may be predictive of the chances of recovery of the patient. Several transcript variants, one encoding a secreted isoform and the others encoding nonsecreted isoforms, have been found for this gene. | IGFBP2 | ENSG00000115457 | NA |
| aldehyde dehydrogenase 3 family member A1 | 218 | Aldehyde dehydrogenases oxidize various aldehydes to the corresponding acids. They are involved in the detoxification of alcohol-derived acetaldehyde and in the metabolism of corticosteroids, biogenic amines, neurotransmitters, and lipid peroxidation. The enzyme encoded by this gene forms a cytoplasmic homodimer that preferentially oxidizes aromatic and medium-chain (6 carbons or more) saturated and unsaturated aldehyde substrates. It is thought to promote resistance to UV and 4-hydroxy-2-nonenal-induced oxidative damage in the cornea. The gene is located within the Smith-Magenis syndrome region on chromosome 17. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ALDH3A1 | ENSG00000108602 | NA |
| syntaxin 19 | 415117 | NA | STX19 | ENSG00000178750 | NA |
| phospholamban | 5350 | The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | PLN | ENSG00000198523 | NA |
| G protein-coupled receptor class C group 5 member A | 9052 | This gene encodes a member of the type 3 G protein-coupling receptor family, characterized by the signature 7-transmembrane domain motif. The encoded protein may be involved in interaction between retinoic acid and G protein signalling pathways. Retinoic acid plays a critical role in development, cellular growth, and differentiation. This gene may play a role in embryonic development and epithelial cell differentiation. | GPRC5A | ENSG00000013588 | NA |
| retinoic acid receptor responder 1 | 5918 | This gene was identified as a retinoic acid (RA) receptor-responsive gene. It encodes a type 1 membrane protein. The expression of this gene is upregulated by tazarotene as well as by retinoic acid receptors. The expression of this gene is found to be downregulated in prostate cancer, which is caused by the methylation of its promoter and CpG island. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | RARRES1 | ENSG00000118849 | NA |
| small cell adhesion glycoprotein | 57228 | NA | SMAGP | ENSG00000170545 | NA |
| collagen type III alpha 1 chain | 1281 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome type IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | COL3A1 | ENSG00000168542 | NA |
| stratifin | 2810 | NA | SFN | ENSG00000175793 | NA |
| tandem C2 domains, nuclear | 123036 | NA | TC2N | ENSG00000165929 | NA |
| cyclin D1 | 595 | The protein encoded by this gene belongs to the highly conserved cyclin family, whose members are characterized by a dramatic periodicity in protein abundance throughout the cell cycle. Cyclins function as regulators of CDK kinases. Different cyclins exhibit distinct expression and degradation patterns which contribute to the temporal coordination of each mitotic event. This cyclin forms a complex with and functions as a regulatory subunit of CDK4 or CDK6, whose activity is required for cell cycle G1/S transition. This protein has been shown to interact with tumor suppressor protein Rb and the expression of this gene is regulated positively by Rb. Mutations, amplification and overexpression of this gene, which alter cell cycle progression, are observed frequently in a variety of tumors and may contribute to tumorigenesis. | CCND1 | ENSG00000110092 | NA |
| laminin subunit alpha 3 | 3909 | The protein encoded by this gene belongs to the laminin family of secreted molecules. Laminins are heterotrimeric molecules that consist of alpha, beta, and gamma subunits that assemble through a coiled-coil domain. Laminins are essential for formation and function of the basement membrane and have additional functions in regulating cell migration and mechanical signal transduction. This gene encodes an alpha subunit and is responsive to several epithelial-mesenchymal regulators including keratinocyte growth factor, epidermal growth factor and insulin-like growth factor. Mutations in this gene have been identified as the cause of Herlitz type junctional epidermolysis bullosa and laryngoonychocutaneous syndrome. Alternative splicing and alternative promoter usage result in multiple transcript variants. | LAMA3 | ENSG00000053747 | NA |
| superoxide dismutase 3, extracellular | 6649 | This gene encodes a member of the superoxide dismutase (SOD) protein family. SODs are antioxidant enzymes that catalyze the conversion of superoxide radicals into hydrogen peroxide and oxygen, which may protect the brain, lungs, and other tissues from oxidative stress. Proteolytic processing of the encoded protein results in the formation of two distinct homotetramers that differ in their ability to interact with the extracellular matrix (ECM). Homotetramers consisting of the intact protein, or type C subunit, exhibit high affinity for heparin and are anchored to the ECM. Homotetramers consisting of a proteolytically cleaved form of the protein, or type A subunit, exhibit low affinity for heparin and do not interact with the ECM. A mutation in this gene may be associated with increased heart disease risk. | SOD3 | ENSG00000109610 | NA |
| EPS8 like 1 | 54869 | This gene encodes a protein that is related to epidermal growth factor receptor pathway substrate 8 (EPS8), a substrate for the epidermal growth factor receptor. The function of this protein is unknown. At least two alternatively spliced transcript variants encoding different isoforms have been found for this gene. | EPS8L1 | ENSG00000131037 | NA |
| EvC ciliary complex subunit 1 | 2121 | This gene encodes a protein containing a leucine zipper and a transmembrane domain. This gene has been implicated in both Ellis-van Creveld syndrome (EvC) and Weyers acrodental dysostosis. | EVC | ENSG00000072840 | NA |
| retinoic acid receptor responder 2 | 5919 | This gene encodes a secreted chemotactic protein that initiates chemotaxis via the ChemR23 G protein-coupled seven-transmembrane domain ligand. Expression of this gene is upregulated by the synthetic retinoid tazarotene and occurs in a wide variety of tissues. The active protein has several roles, including that as an adipokine and as an antimicrobial protein with activity against bacteria and fungi. | RARRES2 | ENSG00000106538 | NA |
| leucine rich repeat containing 3 | 81543 | NA | LRRC3 | ENSG00000160233 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",19,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
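The query-and-save steps repeated for each factor in this document could be wrapped in a small helper. The following is a minimal sketch only: `annotate_factor()`, `query_fun` and `out_dir` are illustrative names that do not appear in the original analysis, and it assumes `gene_list` has one row of Ensembl IDs per factor, as constructed earlier.

```r
# Sketch only: annotate_factor(), query_fun and out_dir are hypothetical names,
# not part of the original analysis.
annotate_factor <- function(k, gene_list, out_dir,
                            query_fun = mygene::queryMany) {
  # Query annotations for the top genes of factor k.
  out <- query_fun(gene_list[k, ], scopes = "ensembl.gene",
                   fields = c("name", "summary", "symbol"),
                   species = "human")
  # Save the queried Ensembl IDs, one per line, mirroring the write.table calls here.
  write.table(as.character(out$query),
              file.path(out_dir, paste0("gene_names_clus_", k, ".txt")),
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  as.data.frame(out)
}

# Possible usage, assuming gene_list from earlier in this document:
# tables <- lapply(seq_len(nrow(gene_list)), annotate_factor,
#                  gene_list = gene_list,
#                  out_dir = "../utilities/GTEX2013_sparse_load_voom")
# knitr::kable(tables[[20]])
```

Passing the query function as an argument keeps the file index tied to the row of `gene_list` that was queried, and lets the helper be exercised with a stub when no network connection is available.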
out <- mygene::queryMany(gene_list[20,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | summary | name | X_id | notfound |
|---|---|---|---|---|---|
| MAL | ENSG00000172005 | The protein encoded by this gene is a highly hydrophobic integral membrane protein belonging to the MAL family of proteolipids. The protein has been localized to the endoplasmic reticulum of T-cells and is a candidate linker protein in T-cell signal transduction. In addition, this proteolipid is localized in compact myelin of cells in the nervous system and has been implicated in myelin biogenesis and/or function. The protein plays a role in the formation, stabilization and maintenance of glycosphingolipid-enriched membrane microdomains. Down-regulation of this gene has been associated with a variety of human epithelial malignancies. Alternative splicing produces four transcript variants which vary from each other by the presence or absence of alternatively spliced exons 2 and 3. | mal, T-cell differentiation protein | 4118 | NA |
| CLIC3 | ENSG00000169583 | Chloride channels are a diverse group of proteins that regulate fundamental cellular processes including stabilization of cell membrane potential, transepithelial transport, maintenance of intracellular pH, and regulation of cell volume. Chloride intracellular channel 3 is a member of the p64 family and is predominantly localized in the nucleus and stimulates chloride ion channel activity. In addition, this protein may participate in cellular growth control, based on its association with ERK7, a member of the MAP kinase family. | chloride intracellular channel 3 | 9022 | NA |
| AATK | ENSG00000181409 | The protein encoded by this gene contains a tyrosine kinase domain at the N-terminus and a proline-rich domain at the C-terminus. This gene is induced during apoptosis, and expression of this gene may be a necessary pre-requisite for the induction of growth arrest and/or apoptosis of myeloid precursor cells. This gene has been shown to produce neuronal differentiation in a neuroblastoma cell line. Two transcript variants encoding different isoforms have been found for this gene. | apoptosis-associated tyrosine kinase | 9625 | NA |
| LYZ | ENSG00000090382 | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | lysozyme | 4069 | NA |
| RNASE2 | ENSG00000169385 | The protein encoded by this gene is a non-secretory ribonuclease that belongs to the pancreatic ribonuclease family, a subset of the ribonuclease A superfamily. The protein has antimicrobial activity against viruses. | ribonuclease A family member 2 | 6036 | NA |
| SMIM5 | ENSG00000204323 | NA | small integral membrane protein 5 | 643008 | NA |
| RP11-1143G9.4 | ENSG00000257764 | NA | NA | ENSG00000257764 | NA |
| CDA | ENSG00000158825 | This gene encodes an enzyme involved in pyrimidine salvaging. The encoded protein forms a homotetramer that catalyzes the irreversible hydrolytic deamination of cytidine and deoxycytidine to uridine and deoxyuridine, respectively. It is one of several deaminases responsible for maintaining the cellular pyrimidine pool. Mutations in this gene are associated with decreased sensitivity to the cytosine nucleoside analogue cytosine arabinoside used in the treatment of certain childhood leukemias. | cytidine deaminase | 978 | NA |
| PDLIM4 | ENSG00000131435 | This gene encodes a protein which may be involved in bone development. Mutations in this gene are associated with susceptibility to osteoporosis. | PDZ and LIM domain 4 | 8572 | NA |
| CHAC1 | ENSG00000128965 | NA | ChaC glutathione specific gamma-glutamylcyclotransferase 1 | 79094 | NA |
| CDC42EP5 | ENSG00000167617 | Cell division control protein 42 (CDC42), a small Rho GTPase, regulates the formation of F-actin-containing structures through its interaction with the downstream effector proteins. The protein encoded by this gene is a member of the Borg (binder of Rho GTPases) family of CDC42 effector proteins. Borg family proteins contain a CRIB (Cdc42/Rac interactive-binding) domain. They bind to CDC42 and regulate its function negatively. The encoded protein may inhibit c-Jun N-terminal kinase (JNK) independently of CDC42 binding. The protein may also play a role in septin organization and inducing pseudopodia formation in fibroblasts. | CDC42 effector protein 5 | 148170 | NA |
| LIF | ENSG00000128342 | The protein encoded by this gene is a pleiotropic cytokine with roles in several different systems. It is involved in the induction of hematopoietic differentiation in normal and myeloid leukemia cells, induction of neuronal cell differentiation, and regulation of mesenchymal to epithelial conversion during kidney development, and may also have a role in immune tolerance at the maternal-fetal interface. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | leukemia inhibitory factor | 3976 | NA |
| COL1A1 | ENSG00000108821 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 1 | 1277 | NA |
| FAM83D | ENSG00000101447 | NA | family with sequence similarity 83 member D | 81610 | NA |
| AEBP1 | ENSG00000106624 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | AE binding protein 1 | 165 | NA |
| TUBB6 | ENSG00000176014 | NA | tubulin beta 6 class V | 84617 | NA |
| TMEM52 | ENSG00000178821 | NA | transmembrane protein 52 | 339456 | NA |
| IGSF6 | ENSG00000140749 | NA | immunoglobulin superfamily member 6 | 10261 | NA |
| CGA | ENSG00000135346 | The four human glycoprotein hormones chorionic gonadotropin (CG), luteinizing hormone (LH), follicle stimulating hormone (FSH), and thyroid stimulating hormone (TSH) are dimers consisting of alpha and beta subunits that are associated noncovalently. The alpha subunits of these hormones are identical, however, their beta chains are unique and confer biological specificity. The protein encoded by this gene is the alpha subunit and belongs to the glycoprotein hormones alpha chain family. Two transcript variants encoding different isoforms have been found for this gene. | glycoprotein hormones, alpha polypeptide | 1081 | NA |
| RAP1GAP | ENSG00000076864 | This gene encodes a type of GTPase-activating-protein (GAP) that down-regulates the activity of the ras-related RAP1 protein. RAP1 acts as a molecular switch by cycling between an inactive GDP-bound form and an active GTP-bound form. The product of this gene, RAP1GAP, promotes the hydrolysis of bound GTP and hence returns RAP1 to the inactive state whereas other proteins, guanine nucleotide exchange factors (GEFs), act as RAP1 activators by facilitating the conversion of RAP1 from the GDP- to the GTP-bound form. In general, ras subfamily proteins, such as RAP1, play key roles in receptor-linked signaling pathways that control cell growth and differentiation. RAP1 plays a role in diverse processes such as cell proliferation, adhesion, differentiation, and embryogenesis. Alternative splicing results in multiple transcript variants encoding distinct proteins. | RAP1 GTPase activating protein | 5909 | NA |
| ATP8B1 | ENSG00000081923 | This gene encodes a member of the P-type cation transport ATPase family, which belongs to the subfamily of aminophospholipid-transporting ATPases. The aminophospholipid translocases transport phosphatidylserine and phosphatidylethanolamine from one side of a bilayer to another. Mutations in this gene may result in progressive familial intrahepatic cholestasis type 1 and in benign recurrent intrahepatic cholestasis. | ATPase phospholipid transporting 8B1 | 5205 | NA |
| TM4SF1 | ENSG00000169908 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface antigen and is highly expressed in different carcinomas. | transmembrane 4 L six family member 1 | 4071 | NA |
| CSF3R | ENSG00000119535 | The protein encoded by this gene is the receptor for colony stimulating factor 3, a cytokine that controls the production, differentiation, and function of granulocytes. The encoded protein, which is a member of the family of cytokine receptors, may also function in some cell surface adhesion or recognition processes. Alternatively spliced transcript variants have been described. Mutations in this gene are a cause of Kostmann syndrome, also known as severe congenital neutropenia. | colony stimulating factor 3 receptor | 1441 | NA |
| DPT | ENSG00000143196 | Dermatopontin is an extracellular matrix protein with possible functions in cell-matrix interactions and matrix assembly. The protein is found in various tissues and many of its tyrosine residues are sulphated. Dermatopontin is postulated to modify the behavior of TGF-beta through interaction with decorin. | dermatopontin | 1805 | NA |
| GEM | ENSG00000164949 | The protein encoded by this gene belongs to the RAD/GEM family of GTP-binding proteins. It is associated with the inner face of the plasma membrane and could play a role as a regulatory protein in receptor-mediated signal transduction. Alternative splicing occurs at this locus and two transcript variants encoding the same protein have been identified. | GTP binding protein overexpressed in skeletal muscle | 2669 | NA |
| A4GALT | ENSG00000128274 | The protein encoded by this gene catalyzes the transfer of galactose to lactosylceramide to form globotriaosylceramide, which has been identified as the P(k) antigen of the P blood group system. This protein, a type II membrane protein found in the Golgi, is also required for the synthesis of the bacterial verotoxins receptor. Alternatively spliced transcript variants have been found for this gene. | alpha 1,4-galactosyltransferase | 53947 | NA |
| CD109 | ENSG00000156535 | This gene encodes a glycosyl phosphatidylinositol (GPI)-linked glycoprotein that localizes to the surface of platelets, activated T-cells, and endothelial cells. The protein binds to and negatively regulates signalling by transforming growth factor beta (TGF-beta). Multiple transcript variants encoding different isoforms have been found for this gene. | CD109 molecule | 135228 | NA |
| TNC | ENSG00000041982 | This gene encodes an extracellular matrix protein with a spatially and temporally restricted tissue distribution. This protein is homohexameric with disulfide-linked subunits, and contains multiple EGF-like and fibronectin type-III domains. It is implicated in guidance of migrating neurons as well as axons during development, synaptic plasticity, and neuronal regeneration. | tenascin C | 3371 | NA |
| FBLIM1 | ENSG00000162458 | This gene encodes a protein with an N-terminal filamin-binding domain, a central proline-rich domain, and multiple C-terminal LIM domains. This protein localizes at cell junctions and may link cell adhesion structures to the actin cytoskeleton. This protein may be involved in the assembly and stabilization of actin-filaments and likely plays a role in modulating cell adhesion, cell morphology and cell motility. This protein also localizes to the nucleus and may affect cardiomyocyte differentiation after binding with the CSX/NKX2-5 transcription factor. Alternative splicing results in multiple transcript variants encoding different isoforms. | filamin binding LIM protein 1 | 54751 | NA |
| EMILIN1 | ENSG00000138080 | This gene encodes an extracellular matrix glycoprotein that is characterized by an N-terminal microfibril interface domain, a coiled-coil alpha-helical domain, a collagenous domain and a C-terminal globular C1q domain. The encoded protein associates with elastic fibers at the interface between elastin and microfibrils and may play a role in the development of elastic tissues including large blood vessels, dermis, heart and lung. | elastin microfibril interfacer 1 | 11117 | NA |
| PPIC | ENSG00000168938 | The protein encoded by this gene is a member of the peptidyl-prolyl cis-trans isomerase (PPIase) family. PPIases catalyze the cis-trans isomerization of proline imidic peptide bonds in oligopeptides and accelerate the folding of proteins. Similar to other PPIases, this protein can bind immunosuppressant cyclosporin A. | peptidylprolyl isomerase C | 5480 | NA |
| IGFBP4 | ENSG00000141753 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. | insulin like growth factor binding protein 4 | 3487 | NA |
| BTG3 | ENSG00000154640 | The protein encoded by this gene is a member of the BTG/Tob family. This family has structurally related proteins that appear to have antiproliferative properties. This encoded protein might play a role in neurogenesis in the central nervous system. Two transcript variants encoding different isoforms have been found for this gene. | BTG family member 3 | 10950 | NA |
| MFAP4 | ENSG00000166482 | This gene encodes a protein with similarity to a bovine microfibril-associated protein. The protein has binding specificities for both collagen and carbohydrate. It is thought to be an extracellular matrix protein which is involved in cell adhesion or intercellular interactions. The gene is located within the Smith-Magenis syndrome region. Two transcript variants encoding different isoforms have been found for this gene. | microfibrillar associated protein 4 | 4239 | NA |
| TPBG | ENSG00000146242 | This gene encodes a leucine-rich transmembrane glycoprotein that may be involved in cell adhesion. The encoded protein is an oncofetal antigen that is specific to trophoblast cells. In adults this protein is highly expressed in many tumor cells and is associated with poor clinical outcome in numerous cancers. Alternate splicing in the 5’ UTR results in multiple transcript variants that encode the same protein. | trophoblast glycoprotein | 7162 | NA |
| PHLDA2 | ENSG00000181649 | This gene is located in a cluster of imprinted genes on chromosome 11p15.5, which is considered to be an important tumor suppressor gene region. Alterations in this region may be associated with the Beckwith-Wiedemann syndrome, Wilms tumor, rhabdomyosarcoma, adrenocortical carcinoma, and lung, ovarian, and breast cancer. This gene has been shown to be imprinted, with preferential expression from the maternal allele in placenta and liver. | pleckstrin homology like domain family A member 2 | 7262 | NA |
| COL5A1 | ENSG00000130635 | This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. The encoded procollagen protein occurs commonly as the heterotrimer pro-alpha1(V)-pro-alpha1(V)-pro-alpha2(V). Mutations in this gene are associated with Ehlers-Danlos syndrome, types I and II. Alternative splicing of this gene results in multiple transcript variants. | collagen type V alpha 1 | 1289 | NA |
| TOX2 | ENSG00000124191 | NA | TOX high mobility group box family member 2 | 84969 | NA |
| MMP23B | ENSG00000189409 | This gene (MMP23B) encodes a member of the matrix metalloproteinase (MMP) family, and it is part of a duplicated region of chromosome 1p36.3. Proteins of the matrix metalloproteinase (MMP) family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. This gene belongs to the more telomeric copy of the duplicated region. | matrix metallopeptidase 23B | 8510 | NA |
| PPP1R1A | ENSG00000135447 | NA | protein phosphatase 1 regulatory inhibitor subunit 1A | 5502 | NA |
| PRL | ENSG00000172179 | This gene encodes the anterior pituitary hormone prolactin. This secreted hormone is a growth regulator for many tissues, including cells of the immune system. It may also play a role in cell survival by suppressing apoptosis, and it is essential for lactation. Alternative splicing results in multiple transcript variants that encode the same protein. | prolactin | 5617 | NA |
| COL15A1 | ENSG00000204291 | This gene encodes the alpha chain of type XV collagen, a member of the FACIT collagen family (fibril-associated collagens with interrupted helices). Type XV collagen has a wide tissue distribution but the strongest expression is localized to basement membrane zones so it may function to adhere basement membranes to underlying connective tissue stroma. The proteolytically produced C-terminal fragment of type XV collagen is restin, a potentially antiangiogenic protein that is closely related to endostatin. Mouse studies have shown that collagen XV deficiency is associated with muscle and microvessel deterioration. | collagen type XV alpha 1 chain | 1306 | NA |
| SNORA73B | ENSG00000200087 | NA | small nucleolar RNA, H/ACA box 73B | ENSG00000200087 | NA |
| IL12A | ENSG00000168811 | This gene encodes a subunit of a cytokine that acts on T and natural killer cells, and has a broad array of biological activities. The cytokine is a disulfide-linked heterodimer composed of the 35-kD subunit encoded by this gene, and a 40-kD subunit that is a member of the cytokine receptor family. This cytokine is required for the T-cell-independent induction of interferon (IFN)-gamma, and is important for the differentiation of both Th1 and Th2 cells. The responses of lymphocytes to this cytokine are mediated by the activator of transcription protein STAT4. Nitric oxide synthase 2A (NOS2A/NOS2) is found to be required for the signaling process of this cytokine in innate immunity. | interleukin 12A | 3592 | NA |
| ACKR3 | ENSG00000144476 | This gene encodes a member of the G-protein coupled receptor family. Although this protein was earlier thought to be a receptor for vasoactive intestinal peptide (VIP), it is now considered to be an orphan receptor, in that its endogenous ligand has not been identified. The protein is also a coreceptor for human immunodeficiency viruses (HIV). Translocations involving this gene and HMGA2 on chromosome 12 have been observed in lipomas. | atypical chemokine receptor 3 | 57007 | NA |
| RARRES2 | ENSG00000106538 | This gene encodes a secreted chemotactic protein that initiates chemotaxis via the ChemR23 G protein-coupled seven-transmembrane domain ligand. Expression of this gene is upregulated by the synthetic retinoid tazarotene and occurs in a wide variety of tissues. The active protein has several roles, including that as an adipokine and as an antimicrobial protein with activity against bacteria and fungi. | retinoic acid receptor responder 2 | 5919 | NA |
| COL5A2 | ENSG00000204262 | This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. Mutations in this gene are associated with Ehlers-Danlos syndrome, types I and II. | collagen type V alpha 2 chain | 1290 | NA |
| CSRP2 | ENSG00000175183 | CSRP2 is a member of the CSRP family of genes, encoding a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. CRP2 contains two copies of the cysteine-rich amino acid sequence motif (LIM) with putative zinc-binding activity, and may be involved in regulating ordered cell growth. Other genes in the family include CSRP1 and CSRP3. Alternative splicing results in multiple transcript variants. | cysteine and glycine rich protein 2 | 1466 | NA |
| CNN3 | ENSG00000117519 | This gene encodes a protein with a markedly acidic C terminus; the basic N-terminus is highly homologous to the N-terminus of a related gene, CNN1. Members of the CNN gene family all contain similar tandemly repeated motifs. This encoded protein is associated with the cytoskeleton but is not involved in contraction. | calponin 3 | 1266 | NA |
| COL12A1 | ENSG00000111799 | This gene encodes the alpha chain of type XII collagen, a member of the FACIT (fibril-associated collagens with interrupted triple helices) collagen family. Type XII collagen is a homotrimer found in association with type I collagen, an association that is thought to modify the interactions between collagen I fibrils and the surrounding matrix. Alternatively spliced transcript variants encoding different isoforms have been identified. | collagen type XII alpha 1 chain | 1303 | NA |
| PDZRN3 | ENSG00000121440 | This gene encodes a member of the LNX (Ligand of Numb Protein-X) family of RING-type ubiquitin E3 ligases. This protein may function in vascular morphogenesis and the differentiation of adipocytes, osteoblasts and myoblasts. This protein may be targeted for degradation by the human papilloma virus E6 protein. Alternative splicing results in multiple transcript variants. | PDZ domain containing ring finger 3 | 23024 | NA |
| OLAH | ENSG00000152463 | NA | oleoyl-ACP hydrolase | 55301 | NA |
| NRARP | ENSG00000198435 | NA | NOTCH-regulated ankyrin repeat protein | 441478 | NA |
| ECE2 | ENSG00000145194 | This gene encodes a member of the M13 family, which includes type 2 integral membrane metallopeptidases. The encoded enzyme is a membrane-bound zinc-dependent metalloprotease. The enzyme catalyzes the cleavage of big endothelin to produce the vasoconstrictor endothelin-1, and plays a role in the processing of several neuroendocrine peptides. It may also have methyltransferase activity. Alternative splicing results in multiple transcript variants. | endothelin converting enzyme 2 | 9718 | NA |
| DLL1 | ENSG00000198719 | DLL1 is a human homolog of the Notch Delta ligand and is a member of the delta/serrate/jagged family. It plays a role in mediating cell fate decisions during hematopoiesis. It may play a role in cell-to-cell communication. | delta like canonical Notch ligand 1 | 28514 | NA |
| PTGS2 | ENSG00000073756 | Prostaglandin-endoperoxide synthase (PTGS), also known as cyclooxygenase, is the key enzyme in prostaglandin biosynthesis, and acts both as a dioxygenase and as a peroxidase. There are two isozymes of PTGS: a constitutive PTGS1 and an inducible PTGS2, which differ in their regulation of expression and tissue distribution. This gene encodes the inducible isozyme. It is regulated by specific stimulatory events, suggesting that it is responsible for the prostanoid biosynthesis involved in inflammation and mitogenesis. | prostaglandin-endoperoxide synthase 2 | 5743 | NA |
| NKD2 | ENSG00000145506 | This gene encodes a member of a family of proteins that function as negative regulators of Wnt receptor signaling through interaction with Dishevelled family members. The encoded protein participates in the delivery of transforming growth factor alpha-containing vesicles to the cell membrane. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | naked cuticle homolog 2 | 85409 | NA |
| P2RY12 | ENSG00000169313 | The product of this gene belongs to the family of G-protein coupled receptors. This family has several receptor subtypes with different pharmacological selectivity, which overlaps in some cases, for various adenosine and uridine nucleotides. This receptor is involved in platelet aggregation, and is a potential target for the treatment of thromboembolisms and other clotting disorders. Mutations in this gene are implicated in bleeding disorder, platelet type 8 (BDPLT8). Alternative splicing results in multiple transcript variants of this gene. | purinergic receptor P2Y12 | 64805 | NA |
| WWTR1-AS1 | ENSG00000241313 | NA | WWTR1 antisense RNA 1 | 100128025 | NA |
| CTD-2184D3.5 | ENSG00000259712 | NA | NA | ENSG00000259712 | NA |
| ATP2A1-AS1 | ENSG00000260442 | NA | ATP2A1 antisense RNA 1 | 100289092 | NA |
| FBLN1 | ENSG00000077942 | Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | fibulin 1 | 2192 | NA |
| PTGIR | ENSG00000160013 | The protein encoded by this gene is a member of the G-protein coupled receptor family 1 and has been shown to be a receptor for prostacyclin. Prostacyclin, the major product of cyclooxygenase in macrovascular endothelium, elicits a potent vasodilation and inhibition of platelet aggregation through binding to this receptor. | prostaglandin I2 (prostacyclin) receptor (IP) | 5739 | NA |
| VCAM1 | ENSG00000162692 | This gene is a member of the Ig superfamily and encodes a cell surface sialoglycoprotein expressed by cytokine-activated endothelium. This type I membrane protein mediates leukocyte-endothelial cell adhesion and signal transduction, and may play a role in the development of atherosclerosis and rheumatoid arthritis. Three alternatively spliced transcripts encoding different isoforms have been described for this gene. | vascular cell adhesion molecule 1 | 7412 | NA |
| DDIT4L | ENSG00000145358 | NA | DNA damage inducible transcript 4 like | 115265 | NA |
| C8orf4 | ENSG00000176907 | This gene encodes a small, monomeric, predominantly unstructured protein that functions as a positive regulator of the Wnt/beta-catenin signaling pathway. This protein interacts with a repressor of beta-catenin mediated transcription at nuclear speckles. It is thought to competitively block interactions of the repressor with beta-catenin, resulting in up-regulation of beta-catenin target genes. The encoded protein may also play a role in the NF-kappaB and ERK1/2 signaling pathways. Expression of this gene may play a role in the proliferation of several types of cancer including thyroid cancer, breast cancer and hematological malignancies. | chromosome 8 open reading frame 4 | 56892 | NA |
| IER3 | ENSG00000137331 | This gene functions in the protection of cells from Fas- or tumor necrosis factor type alpha-induced apoptosis. Partially degraded and unspliced transcripts are found after virus infection in vitro, but these transcripts are not found in vivo and do not generate a valid protein. | immediate early response 3 | 8870 | NA |
| CCDC102B | ENSG00000150636 | NA | coiled-coil domain containing 102B | 79839 | NA |
| SERPINA1 | ENSG00000197249 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | serpin family A member 1 | 5265 | NA |
| TMEM266 | ENSG00000169758 | NA | transmembrane protein 266 | 123591 | NA |
| FILIP1L | ENSG00000168386 | NA | filamin A interacting protein 1 like | 11259 | NA |
| HES4 | ENSG00000188290 | NA | hes family bHLH transcription factor 4 | 57801 | NA |
| FBN1 | ENSG00000166147 | This gene encodes a member of the fibrillin family of proteins. The encoded preproprotein is proteolytically processed to generate two proteins including the extracellular matrix component fibrillin-1 and the protein hormone asprosin. Fibrillin-1 is an extracellular matrix glycoprotein that serves as a structural component of calcium-binding microfibrils. These microfibrils provide force-bearing structural support in elastic and nonelastic connective tissue throughout the body. Asprosin, secreted by white adipose tissue, has been shown to regulate glucose homeostasis. Mutations in this gene are associated with Marfan syndrome and the related MASS phenotype, as well as ectopia lentis syndrome, Weill-Marchesani syndrome, Shprintzen-Goldberg syndrome and neonatal progeroid syndrome. | fibrillin 1 | 2200 | NA |
| SHB | ENSG00000107338 | NA | SH2 domain containing adaptor protein B | 6461 | NA |
| PPFIBP1 | ENSG00000110841 | The protein encoded by this gene is a member of the LAR protein-tyrosine phosphatase-interacting protein (liprin) family. Liprins interact with members of LAR family of transmembrane protein tyrosine phosphatases, which are known to be important for axon guidance and mammary gland development. It has been proposed that liprins are multivalent proteins that form complex structures and act as scaffolds for the recruitment and anchoring of LAR family of tyrosine phosphatases. This protein was found to interact with S100A4, a calcium-binding protein related to tumor invasiveness and metastasis. In vitro experiment demonstrated that the interaction inhibited the phosphorylation of this protein by protein kinase C and protein kinase CK2. Alternatively spliced transcript variants encoding distinct isoforms have been reported. | PPFIA binding protein 1 | 8496 | NA |
| VASN | ENSG00000168140 | NA | vasorin | 114990 | NA |
| LUM | ENSG00000139329 | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin. In these bifunctional molecules, the protein moiety binds collagen fibrils and the highly charged hydrophilic glycosaminoglycans regulate interfibrillar spacings. Lumican is the major keratan sulfate proteoglycan of the cornea but is also distributed in interstitial collagenous matrices throughout the body. Lumican may regulate collagen fibril organization and circumferential growth, corneal transparency, and epithelial cell migration and tissue repair. | lumican | 4060 | NA |
| CTD-3128G10.6 | ENSG00000269680 | NA | NA | ENSG00000269680 | NA |
| COL1A2 | ENSG00000164692 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 2 chain | 1278 | NA |
| NNAT | ENSG00000053438 | The protein encoded by this gene is a proteolipid that may be involved in the regulation of ion channels during brain development. The encoded protein may also play a role in forming and maintaining the structure of the nervous system. This gene is found within an intron of another gene, bladder cancer associated protein, but on the opposite strand. This gene is imprinted and is expressed only from the paternal allele. | neuronatin | 4826 | NA |
| MAPK12 | ENSG00000188130 | Activation of members of the mitogen-activated protein kinase family is a major mechanism for transduction of extracellular signals. Stress-activated protein kinases are one subclass of MAP kinases. The protein encoded by this gene functions as a signal transducer during differentiation of myoblasts to myotubes. | mitogen-activated protein kinase 12 | 6300 | NA |
| ADAMTS7 | ENSG00000136378 | The protein encoded by this gene is a member of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) family. Members of this family share several distinct protein modules, including a propeptide region, a metalloproteinase domain, a disintegrin-like domain, and a thrombospondin type 1 (TS) motif. Individual members of this family differ in the number of C-terminal TS motifs, and some have unique C-terminal domains. The encoded preproprotein is proteolytically processed to generate the mature enzyme. This enzyme contains two C-terminal TS motifs and may regulate vascular smooth muscle cell (VSMC) migration. Mutations in this gene may be associated with susceptibility to coronary artery disease. | ADAM metallopeptidase with thrombospondin type 1 motif 7 | 11173 | NA |
| HES1 | ENSG00000114315 | This protein belongs to the basic helix-loop-helix family of transcription factors. It is a transcriptional repressor of genes that require a bHLH protein for their transcription. The protein has a particular type of basic domain that contains a helix interrupting protein that binds to the N-box rather than the canonical E-box. | hes family bHLH transcription factor 1 | 3280 | NA |
| RP11-359P5.1 | ENSG00000249996 | NA | NA | ENSG00000249996 | NA |
| MMP2 | ENSG00000087245 | This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or intracellularly by its S-glutathiolation with no requirement for proteolytic removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | matrix metallopeptidase 2 | 4313 | NA |
| RP11-54O7.14 | ENSG00000242590 | NA | NA | ENSG00000242590 | NA |
| GPNMB | ENSG00000136235 | The protein encoded by this gene is a type I transmembrane glycoprotein which shows homology to the pMEL17 precursor, a melanocyte-specific protein. GPNMB shows expression in the lowly metastatic human melanoma cell lines and xenografts but does not show expression in the highly metastatic cell lines. GPNMB may be involved in growth delay and reduction of metastatic potential. Two transcript variants encoding different isoforms have been found for this gene. | glycoprotein nmb | 10457 | NA |
| AGRN | ENSG00000188157 | This gene encodes one of several proteins that are critical in the development of the neuromuscular junction (NMJ), as identified in mouse knock-out studies. The encoded protein contains several laminin G, Kazal type serine protease inhibitor, and epidermal growth factor domains. Additional post-translational modifications occur to add glycosaminoglycans and disulfide bonds. In one family with congenital myasthenic syndrome affecting limb-girdle muscles, a mutation in this gene was found. Alternative splicing results in multiple transcript variants encoding different isoforms. | agrin | 375790 | NA |
| GNG12 | ENSG00000172380 | NA | G protein subunit gamma 12 | 55970 | NA |
| COL3A1 | ENSG00000168542 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type III alpha 1 chain | 1281 | NA |
| SEMA3F | ENSG00000001617 | This gene encodes a member of the semaphorin III family of secreted signaling proteins that are involved in axon guidance during neuronal development. The encoded protein contains an N-terminal Sema domain, an immunoglobulin loop and a C-terminal basic domain. This gene is expressed by the endothelial cells where it was found to act in an autocrine fashion to induce apoptosis, inhibit cell proliferation and survival, and function as an anti-tumorigenic agent. Alternative splicing results in multiple transcript variants encoding different isoforms. | semaphorin 3F | 6405 | NA |
| WWTR1 | ENSG00000018408 | NA | WW domain containing transcription regulator 1 | 25937 | NA |
| PDLIM1 | ENSG00000107438 | This gene encodes a member of the enigma protein family. The protein contains two protein interacting domains, a PDZ domain at the amino terminal end and one to three LIM domains at the carboxyl terminal. It is a cytoplasmic protein associated with the cytoskeleton. The protein may function as an adapter to bring other LIM-interacting proteins to the cytoskeleton. Pseudogenes associated with this gene are located on chromosomes 3, 14 and 17. | PDZ and LIM domain 1 | 9124 | NA |
| KCNJ12 | ENSG00000184185 | This gene encodes an inwardly rectifying K+ channel which may be blocked by divalent cations. This protein is thought to be one of multiple inwardly rectifying channels which contribute to the cardiac inward rectifier current (IK1). The gene is located within the Smith-Magenis syndrome region on chromosome 17. | potassium voltage-gated channel subfamily J member 12 | 3768 | NA |
| NA | ENSG00000255905 | NA | NA | NA | TRUE |
| UGDH | ENSG00000109814 | The protein encoded by this gene converts UDP-glucose to UDP-glucuronate and thereby participates in the biosynthesis of glycosaminoglycans such as hyaluronan, chondroitin sulfate, and heparan sulfate. These glycosylated compounds are common components of the extracellular matrix and likely play roles in signal transduction, cell migration, and cancer growth and metastasis. The expression of this gene is up-regulated by transforming growth factor beta and down-regulated by hypoxia. Alternative splicing results in multiple transcript variants. | UDP-glucose 6-dehydrogenase | 7358 | NA |
| SMOC2 | ENSG00000112562 | This gene encodes a member of the SPARC family (secreted protein acidic and rich in cysteine/osteonectin/BM-40), which are highly expressed during embryogenesis and wound healing. The gene product is a matricellular protein which promotes matrix assembly and can stimulate endothelial cell proliferation and migration, as well as angiogenic activity. Associated with pulmonary function, this secretory gene product contains a Kazal domain, two thyroglobulin type-1 domains, and two EF-hand calcium-binding domains. The encoded protein may serve as a target for controlling angiogenesis in tumor growth and myocardial ischemia. Alternative splicing results in multiple transcript variants. | SPARC related modular calcium binding 2 | 64094 | NA |
| RBP1 | ENSG00000114115 | This gene encodes the carrier protein involved in the transport of retinol (vitamin A alcohol) from the liver storage site to peripheral tissue. Vitamin A is a fat-soluble vitamin necessary for growth, reproduction, differentiation of epithelial tissues, and vision. Multiple transcript variants encoding different isoforms have been found for this gene. | retinol binding protein 1 | 5947 | NA |
| RP5-906A24.2 | ENSG00000266101 | NA | NA | ENSG00000266101 | NA |
| NEURL1 | ENSG00000107954 | NA | neuralized E3 ubiquitin protein ligase 1 | 9148 | NA |
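Some queries in the table above come back without an annotation (for example, the row flagged `TRUE` in the last column). A minimal sketch of separating such rows before using the gene list downstream is given here; `annot` is a hypothetical toy stand-in for `as.data.frame(out)` that keeps only three columns, and the `notfound` column name is assumed to match the one shown in the `queryMany` output tables of this document.

```r
# Toy stand-in for as.data.frame(out); the four rows are copied from the table above.
annot <- data.frame(
  query    = c("ENSG00000184185", "ENSG00000255905", "ENSG00000109814", "ENSG00000266101"),
  symbol   = c("KCNJ12", NA, "UGDH", "RP5-906A24.2"),
  notfound = c(NA, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

# Queries flagged as not found (NA in `notfound` means the query was resolved).
missing_ids <- annot$query[!is.na(annot$notfound) & annot$notfound]

# Rows that did resolve and can be carried forward.
annotated <- annot[!(annot$query %in% missing_ids), ]

print(missing_ids)
print(nrow(annotated))
```

Whether unresolved IDs should be dropped or kept in the exported gene list depends on the downstream tool; the sketch only shows how to identify them.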
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_load_voom/gene_names_clus_",20,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
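The call above saves the gene IDs for one cluster. As a sketch only, the same export could be looped over every row of a `gene_list`-shaped matrix; `write_cluster_genes`, its `prefix` argument, and the temporary output directory are hypothetical names introduced for illustration, and the toy matrix reuses a few Ensembl IDs that appear in the tables of this document.

```r
# Hypothetical helper: write each row of a cluster-by-gene ID matrix to its own file.
write_cluster_genes <- function(gene_list, out_dir, prefix = "gene_names_clus_") {
  paths <- character(nrow(gene_list))
  for (k in seq_len(nrow(gene_list))) {
    paths[k] <- file.path(out_dir, paste0(prefix, k, ".txt"))
    # One ID per line, no header, no row names, no quotes.
    write.table(gene_list[k, ], paths[k],
                col.names = FALSE, row.names = FALSE, quote = FALSE)
  }
  paths
}

# Toy input: 2 clusters x 3 genes (not the real top-100 lists).
toy_gene_list <- rbind(
  c("ENSG00000162458", "ENSG00000138080", "ENSG00000168938"),
  c("ENSG00000133392", "ENSG00000011465", "ENSG00000075624")
)

out_dir <- tempfile("clusters_")
dir.create(out_dir)
paths <- write_cluster_genes(toy_gene_list, out_dir)
print(readLines(paths[2]))
```

Note that the original call writes `out$query` (the IDs as returned by the annotation query) rather than the raw row of `gene_list`; the sketch writes the raw IDs to avoid a network call.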
We next transpose the data matrix and refit SFA, so that the sparsity falls on the factors.
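A minimal sketch of the transposition step, assuming a toy count matrix with samples in rows and genes in columns (the orientation, the simulated values, and the output file are all assumptions made for illustration, not the actual GTEx input):

```r
# Toy counts: 4 samples (rows) x 6 genes (columns); values are simulated.
set.seed(1)
counts <- matrix(rpois(24, lambda = 20), nrow = 4, ncol = 6)

# Square-root transform, then transpose so that genes become rows.
sqrt_counts <- sqrt(counts)
sqrt_counts_t <- t(sqrt_counts)

# Write a plain numeric matrix (no header, no row names) to a temporary file.
out_file <- tempfile(fileext = ".txt")
write.table(sqrt_counts_t, out_file,
            row.names = FALSE, col.names = FALSE, quote = FALSE)

print(dim(sqrt_counts_t))
```

Fitting the transposed matrix swaps the roles of the two output matrices, which is why the code further down extracts top features from `lambda_out` rather than from `f_out`.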
## Voom counts transpose
# Expected complete Log likelihood at iteration 100: -8.18694e+07
# Marginal log likelihood at iteration 100: inf
# Residual variance at iteration 100: 1.72648
# Residual sum of squares at iteration 100: 6.53386e+07
## Sqrt counts transpose
# Expected complete Log likelihood at iteration 100: -3.9495e+08
# Marginal log likelihood at iteration 100: -inf
# Residual variance at iteration 100: 638.648
# Residual sum of squares at iteration 100: 2.35575e+10
## counts transpose
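# SFA outputs for the transposed sqrt-count fit: loadings (lambda) and factors (F).
lambda_out <- read.table("../sfa_outputs/GTEX2013_transpose/sqrt_counts_gtex/gtex_sqrt_counts_transpose_lambda.out")
f_out <- read.table("../sfa_outputs/GTEX2013_transpose/sqrt_counts_gtex/gtex_sqrt_counts_transpose_F.out")
# Gene names, truncated to the first 15 characters (the Ensembl gene ID).
gene_names <- as.vector(as.matrix(read.table("../sfa_inputs/gene_names_GTEX_V6.txt")))
gene_names <- substring(gene_names, 1, 15)
xli <- gene_names
# Top features are taken from the loadings here; the untransposed analysis used the factor matrix.
indices_mat <- SFA.ExtractTopFeatures(lambda_out, top_features = 100, options="min", mult.annotate = TRUE)
# One row of Ensembl IDs per cluster.
gene_list <- do.call(rbind, lapply(seq_len(nrow(indices_mat)), function(x) gene_names[indices_mat[x,]]))
# Annotate the genes of the first cluster with mygene.
out <- mygene::queryMany(gene_list[1,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human")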
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | query | symbol | summary | X_id | notfound |
|---|---|---|---|---|---|
| myosin, heavy chain 11, smooth muscle | ENSG00000133392 | MYH11 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | 4629 | NA |
| decorin | ENSG00000011465 | DCN | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 | NA |
| actin, beta | ENSG00000075624 | ACTB | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 | NA |
| collagen type VI alpha 3 chain | ENSG00000163359 | COL6A3 | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | 1293 | NA |
| secreted protein acidic and cysteine rich | ENSG00000113140 | SPARC | This gene encodes a cysteine-rich acidic matrix-associated protein. The encoded protein is required for the collagen in bone to become calcified but is also involved in extracellular matrix synthesis and promotion of changes to cell shape. The gene product has been associated with tumor suppression but has also been correlated with metastasis based on changes to cell shape which can promote tumor cell invasion. Three transcript variants encoding different isoforms have been found for this gene. | 6678 | NA |
| lymphocyte cytosolic protein 1 | ENSG00000136167 | LCP1 | Plastins are a family of actin-binding proteins that are conserved throughout eukaryote evolution and expressed in most tissues of higher eukaryotes. In humans, two ubiquitous plastin isoforms (L and T) have been identified. Plastin 1 (otherwise known as Fimbrin) is a third distinct plastin isoform which is specifically expressed at high levels in the small intestine. The L isoform is expressed only in hemopoietic cell lineages, while the T isoform has been found in all other normal cells of solid tissues that have replicative potential (fibroblasts, endothelial cells, epithelial cells, melanocytes, etc.). However, L-plastin has been found in many types of malignant human cells of non-hemopoietic origin suggesting that its expression is induced accompanying tumorigenesis in solid tissues. | 3936 | NA |
| lysosomal protein transmembrane 5 | ENSG00000162511 | LAPTM5 | This gene encodes a transmembrane receptor that is associated with lysosomes. The encoded protein, also known as E3 protein, may play a role in hematopoiesis. | 7805 | NA |
| cysteine and glycine rich protein 1 | ENSG00000159176 | CSRP1 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | 1465 | NA |
| pentraxin 3 | ENSG00000163661 | PTX3 | NA | 5806 | NA |
| apolipoprotein D | ENSG00000189058 | APOD | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | 347 | NA |
| beta-2-microglobulin | ENSG00000166710 | B2M | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | 567 | NA |
| actin, gamma 2, smooth muscle, enteric | ENSG00000163017 | ACTG2 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | 72 | NA |
| CD53 molecule | ENSG00000143119 | CD53 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein that is known to complex with integrins. It contributes to the transduction of CD2-generated signals in T cells and natural killer cells and has been suggested to play a role in growth regulation. Familial deficiency of this gene has been linked to an immunodeficiency associated with recurrent infectious diseases caused by bacteria, fungi and viruses. Alternative splicing results in multiple transcript variants. | 963 | NA |
| collagen type I alpha 2 chain | ENSG00000164692 | COL1A2 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1278 | NA |
| transgelin | ENSG00000149591 | TAGLN | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | 6876 | NA |
| CD248 molecule | ENSG00000174807 | CD248 | NA | 57124 | NA |
| sortilin-related receptor, L(DLR class) A repeats containing | ENSG00000137642 | SORL1 | This gene encodes a mosaic protein that belongs to at least two families: the vacuolar protein sorting 10 (VPS10) domain-containing receptor family, and the low density lipoprotein receptor (LDLR) family. The encoded protein also contains fibronectin type III repeats and an epidermal growth factor repeat. The encoded preproprotein is proteolytically processed to generate the mature receptor, which likely plays roles in endocytosis and sorting. Mutations in this gene may be associated with Alzheimer’s disease. | 6653 | NA |
| collagen type XII alpha 1 chain | ENSG00000111799 | COL12A1 | This gene encodes the alpha chain of type XII collagen, a member of the FACIT (fibril-associated collagens with interrupted triple helices) collagen family. Type XII collagen is a homotrimer found in association with type I collagen, an association that is thought to modify the interactions between collagen I fibrils and the surrounding matrix. Alternatively spliced transcript variants encoding different isoforms have been identified. | 1303 | NA |
| crystallin alpha B | ENSG00000109846 | CRYAB | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | 1410 | NA |
| actin, alpha 1, skeletal muscle | ENSG00000143632 | ACTA1 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | 58 | NA |
| leiomodin 1 | ENSG00000163431 | LMOD1 | The leiomodin 1 protein has a putative membrane-spanning region and 2 types of tandemly repeated blocks. The transcript is expressed in all tissues tested, with the highest levels in thyroid, eye muscle, skeletal muscle, and ovary. Increased expression of leiomodin 1 may be linked to Graves’ disease and thyroid-associated ophthalmopathy. | 25802 | NA |
| coronin 1A | ENSG00000102879 | CORO1A | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. Alternative splicing results in multiple transcript variants. A related pseudogene has been defined on chromosome 16. | 11151 | NA |
| NDRG family member 4 | ENSG00000103034 | NDRG4 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that is required for cell cycle progression and survival in primary astrocytes and may be involved in the regulation of mitogenic signalling in vascular smooth muscle cells. Alternative splicing results in multiple transcripts encoding different isoforms. | 65009 | NA |
| protein tyrosine phosphatase, non-receptor type 6 | ENSG00000111679 | PTPN6 | The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. N-terminal part of this PTP contains two tandem Src homolog (SH2) domains, which act as protein phospho-tyrosine binding domains, and mediate the interaction of this PTP with its substrates. This PTP is expressed primarily in hematopoietic cells, and functions as an important regulator of multiple signaling pathways in hematopoietic cells. This PTP has been shown to interact with, and dephosphorylate a wide spectrum of phospho-proteins involved in hematopoietic cell signaling. Multiple alternatively spliced variants of this gene, which encode distinct isoforms, have been reported. | 5777 | NA |
| major histocompatibility complex, class I, C | ENSG00000204525 | HLA-C | HLA-C belongs to the HLA class I heavy chain paralogues. This class I molecule is a heterodimer consisting of a heavy chain and a light chain (beta-2 microglobulin). The heavy chain is anchored in the membrane. Class I molecules play a central role in the immune system by presenting peptides derived from endoplasmic reticulum lumen. They are expressed in nearly all cells. The heavy chain is approximately 45 kDa and its gene contains 8 exons. Exon one encodes the leader peptide, exons 2 and 3 encode the alpha1 and alpha2 domain, which both bind the peptide, exon 4 encodes the alpha3 domain, exon 5 encodes the transmembrane region, and exons 6 and 7 encode the cytoplasmic tail. Polymorphisms within exon 2 and exon 3 are responsible for the peptide binding specificity of each class one molecule. Typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. Over one hundred HLA-C alleles have been described. | 3107 | NA |
| serglycin | ENSG00000122862 | SRGN | This gene encodes a protein best known as a hematopoietic cell granule proteoglycan. Proteoglycans stored in the secretory granules of many hematopoietic cells also contain a protease-resistant peptide core, which may be important for neutralizing hydrolytic enzymes. This encoded protein was found to be associated with the macromolecular complex of granzymes and perforin, which may serve as a mediator of granule-mediated apoptosis. Two transcript variants, only one of them protein-coding, have been found for this gene. | 5552 | NA |
| epithelial membrane protein 1 | ENSG00000134531 | EMP1 | NA | 2012 | NA |
| matrix metallopeptidase 2 | ENSG00000087245 | MMP2 | This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or, intracellularly by its S-glutathiolation with no requirement for proteolytical removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | 4313 | NA |
| LDL receptor related protein 1 | ENSG00000123384 | LRP1 | This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | 4035 | NA |
| adrenomedullin | ENSG00000148926 | ADM | The protein encoded by this gene is a preprohormone which is cleaved to form two biologically active peptides, adrenomedullin and proadrenomedullin N-terminal 20 peptide. Adrenomedullin is a 52 aa peptide with several functions, including vasodilation, regulation of hormone secretion, promotion of angiogenesis, and antimicrobial activity. The antimicrobial activity is antibacterial, as the peptide has been shown to kill E. coli and S. aureus at low concentration. | 133 | NA |
| calponin 1 | ENSG00000130176 | CNN1 | NA | 1264 | NA |
| actin binding LIM protein 1 | ENSG00000099204 | ABLIM1 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | 3983 | NA |
| fibronectin 1 | ENSG00000115414 | FN1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | NA |
| lipopolysaccharide induced TNF factor | ENSG00000189067 | LITAF | Lipopolysaccharide is a potent stimulator of monocytes and macrophages, causing secretion of tumor necrosis factor-alpha (TNF-alpha) and other inflammatory mediators. This gene encodes lipopolysaccharide-induced TNF-alpha factor, which is a DNA-binding protein and can mediate the TNF-alpha expression by direct binding to the promoter region of the TNF-alpha gene. The transcription of this gene is induced by tumor suppressor p53 and has been implicated in the p53-induced apoptotic pathway. Mutations in this gene cause Charcot-Marie-Tooth disease type 1C (CMT1C) and may be involved in the carcinogenesis of extramammary Paget’s disease (EMPD). Multiple alternatively spliced transcript variants have been found for this gene. | 9516 | NA |
| complement factor D | ENSG00000197766 | CFD | This gene encodes a member of the S1, or chymotrypsin, family of serine peptidases. This protease catalyzes the cleavage of factor B, the rate-limiting step of the alternative pathway of complement activation. This protein also functions as an adipokine, a cell signaling protein secreted by adipocytes, which regulates insulin secretion in mice. Mutations in this gene underlie complement factor D deficiency, which is associated with recurrent bacterial meningitis infections in human patients. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate the mature protease. | 1675 | NA |
| sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 1 | ENSG00000152377 | SPOCK1 | This gene encodes the protein core of a seminal plasma proteoglycan containing chondroitin- and heparan-sulfate chains. The protein’s function is unknown, although similarity to thyropin-type cysteine protease-inhibitors suggests its function may be related to protease inhibition. | 6695 | NA |
| murine retrovirus integration site 1 homolog | ENSG00000072952 | MRVI1 | This gene is similar to a putative mouse tumor suppressor gene (Mrvi1) that is frequently disrupted by mouse AIDS-related virus (MRV). The encoded protein, which is found in the membrane of the endoplasmic reticulum, is similar to Jaw1, a lymphoid-restricted protein whose expression is down-regulated during lymphoid differentiation. This protein is a substrate of cGMP-dependent kinase-1 (PKG1) that can function as a regulator of IP3-induced calcium release. Studies in mouse suggest that MRV integration at Mrvi1 induces myeloid leukemia by altering the expression of a gene important for myeloid cell growth and/or differentiation, and thus this gene may function as a myeloid leukemia tumor suppressor gene. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, and alternative translation start sites, including a non-AUG (CUG) start site, are used. | 10335 | NA |
| LIM domain containing preferred translocation partner in lipoma | ENSG00000145012 | LPP | This gene encodes a member of a subfamily of LIM domain proteins that are characterized by an N-terminal proline-rich region and three C-terminal LIM domains. The encoded protein localizes to the cell periphery in focal adhesions and may be involved in cell-cell adhesion and cell motility. This protein also shuttles through the nucleus and may function as a transcriptional co-activator. This gene is located at the junction of certain disease-related chromosomal translocations, which result in the expression of chimeric proteins that may promote tumor growth. Alternative splicing results in multiple transcript variants. | 4026 | NA |
| coiled-coil domain containing 80 | ENSG00000091986 | CCDC80 | NA | 151887 | NA |
| hematopoietic cell-specific Lyn substrate 1 | ENSG00000180353 | HCLS1 | NA | 3059 | NA |
| polymerase I and transcript release factor | ENSG00000177469 | PTRF | This gene encodes a protein that enables the dissociation of paused ternary polymerase I transcription complexes from the 3’ end of pre-rRNA transcripts. This protein regulates rRNA transcription by promoting the dissociation of transcription complexes and the reinitiation of polymerase I on nascent rRNA transcripts. This protein also localizes to caveolae at the plasma membrane and is thought to play a critical role in the formation of caveolae and the stabilization of caveolins. This protein translocates from caveolae to the cytoplasm after insulin stimulation. Caveolae contain truncated forms of this protein and may be the site of phosphorylation-dependent proteolysis. This protein is also thought to modify lipid metabolism and insulin-regulated gene expression. Mutations in this gene result in a disorder characterized by generalized lipodystrophy and muscular dystrophy. | 284119 | NA |
| lumican | ENSG00000139329 | LUM | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin. In these bifunctional molecules, the protein moiety binds collagen fibrils and the highly charged hydrophilic glycosaminoglycans regulate interfibrillar spacings. Lumican is the major keratan sulfate proteoglycan of the cornea but is also distributed in interstitial collagenous matrices throughout the body. Lumican may regulate collagen fibril organization and circumferential growth, corneal transparency, and epithelial cell migration and tissue repair. | 4060 | NA |
| pleckstrin and Sec7 domain containing 4 | ENSG00000125637 | PSD4 | NA | 23550 | NA |
| FGR proto-oncogene, Src family tyrosine kinase | ENSG00000000938 | FGR | This gene is a member of the Src family of protein tyrosine kinases (PTKs). The encoded protein contains N-terminal sites for myristylation and palmitylation, a PTK domain, and SH2 and SH3 domains which are involved in mediating protein-protein interactions with phosphotyrosine-containing and proline-rich motifs, respectively. The protein localizes to plasma membrane ruffles, and functions as a negative regulator of cell migration and adhesion triggered by the beta-2 integrin signal transduction pathway. Infection with Epstein-Barr virus results in the overexpression of this gene. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 2268 | NA |
| gelsolin | ENSG00000148180 | GSN | The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | 2934 | NA |
| myosin light chain 9 | ENSG00000101335 | MYL9 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | 10398 | NA |
| NA | ENSG00000263335 | AF001548.5 | NA | ENSG00000263335 | NA |
| Ras association domain family member 2 | ENSG00000101265 | RASSF2 | This gene encodes a protein that contains a Ras association domain. Similar to its cattle and sheep counterparts, this gene is located near the prion gene. Two alternatively spliced transcripts encoding the same isoform have been reported. | 9770 | NA |
| cathepsin K | ENSG00000143387 | CTSK | The protein encoded by this gene is a lysosomal cysteine proteinase involved in bone remodeling and resorption. This protein, which is a member of the peptidase C1 protein family, is predominantly expressed in osteoclasts. However, the encoded protein is also expressed in a significant fraction of human breast cancers, where it could contribute to tumor invasiveness. Mutations in this gene are the cause of pycnodysostosis, an autosomal recessive disease characterized by osteosclerosis and short stature. | 1513 | NA |
| laminin subunit beta 1 | ENSG00000091136 | LAMB1 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 1. The beta 1 chain has 7 structurally distinct domains which it shares with other beta chain isomers. The C-terminal helical region containing domains I and II are separated by domain alpha, domains III and V contain several EGF-like repeats, and domains IV and VI have a globular conformation. Laminin, beta 1 is expressed in most tissues that produce basement membranes, and is one of the 3 chains constituting laminin 1, the first laminin isolated from Engelbreth-Holm-Swarm (EHS) tumor. A sequence in the beta 1 chain that is involved in cell attachment, chemotaxis, and binding to the laminin receptor was identified and shown to have the capacity to inhibit metastasis. | 3912 | NA |
| ras-related C3 botulinum toxin substrate 2 (rho family, small GTP binding protein Rac2) | ENSG00000128340 | RAC2 | This gene encodes a member of the Ras superfamily of small guanosine triphosphate (GTP)-metabolizing proteins. The encoded protein localizes to the plasma membrane, where it regulates diverse processes, such as secretion, phagocytosis, and cell polarization. Activity of this protein is also involved in the generation of reactive oxygen species. Mutations in this gene are associated with neutrophil immunodeficiency syndrome. There is a pseudogene for this gene on chromosome 6. | 5880 | NA |
| integrin subunit beta 2 | ENSG00000160255 | ITGB2 | This gene encodes an integrin beta chain, which combines with multiple different alpha chains to form different integrin heterodimers. Integrins are integral cell-surface proteins that participate in cell adhesion as well as cell-surface mediated signalling. The encoded protein plays an important role in immune response and defects in this gene cause leukocyte adhesion deficiency. Alternative splicing results in multiple transcript variants. | 3689 | NA |
| lamin A/C | ENSG00000160789 | LMNA | The nuclear lamina consists of a two-dimensional matrix of proteins located next to the inner nuclear membrane. The lamin family of proteins make up the matrix and are highly conserved in evolution. During mitosis, the lamina matrix is reversibly disassembled as the lamin proteins are phosphorylated. Lamin proteins are thought to be involved in nuclear stability, chromatin structure and gene expression. Vertebrate lamins consist of two types, A and B. Alternative splicing results in multiple transcript variants. Mutations in this gene lead to several diseases: Emery-Dreifuss muscular dystrophy, familial partial lipodystrophy, limb girdle muscular dystrophy, dilated cardiomyopathy, Charcot-Marie-Tooth disease, and Hutchinson-Gilford progeria syndrome. | 4000 | NA |
| neuropilin 1 | ENSG00000099250 | NRP1 | This gene encodes one of two neuropilins, which contain specific protein domains which allow them to participate in several different types of signaling pathways that control cell migration. Neuropilins contain a large N-terminal extracellular domain, made up of complement-binding, coagulation factor V/VIII, and meprin domains. These proteins also contain a short membrane-spanning domain and a small cytoplasmic domain. Neuropilins bind many ligands and various types of co-receptors; they affect cell survival, migration, and attraction. Some of the ligands and co-receptors bound by neuropilins are vascular endothelial growth factor (VEGF) and semaphorin family members. Several alternatively spliced transcript variants that encode different protein isoforms have been described for this gene. | 8829 | NA |
| NA | ENSG00000259716 | NA | NA | NA | TRUE |
| laminin subunit gamma 1 | ENSG00000135862 | LAMC1 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins, composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively), have a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the gamma chain isoform laminin, gamma 1. The gamma 1 chain, formerly thought to be a beta chain, contains structural domains similar to beta chains, however, lacks the short alpha region separating domains I and II. The structural organization of this gene also suggested that it had diverged considerably from the beta chain genes. Embryos of transgenic mice in which both alleles of the gamma 1 chain gene were inactivated by homologous recombination, lacked basement membranes, indicating that laminin, gamma 1 chain is necessary for laminin heterotrimer assembly. It has been inferred by analogy with the strikingly similar 3’ UTR sequence in mouse laminin gamma 1 cDNA, that multiple polyadenylation sites are utilized in human to generate the 2 different sized mRNAs (5.5 and 7.5 kb) seen on Northern analysis. | 3915 | NA |
| insulin like growth factor binding protein 3 | ENSG00000146674 | IGFBP3 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein forms a ternary complex with insulin-like growth factor acid-labile subunit (IGFALS) and either insulin-like growth factor (IGF) I or II. In this form, it circulates in the plasma, prolonging the half-life of IGFs and altering their interaction with cell surface receptors. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 3486 | NA |
| actin, alpha 2, smooth muscle, aorta | ENSG00000107796 | ACTA2 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 59 | NA |
| testin LIM domain protein | ENSG00000135269 | TES | Cancer-associated chromosomal changes often involve regions containing fragile sites. This gene maps to a common fragile site on chromosome 7q31.2 designated FRA7G. This gene is similar to mouse Testin, a testosterone-responsive gene encoding a Sertoli cell secretory protein containing three LIM domains. LIM domains are double zinc-finger motifs that mediate protein-protein interactions between transcription factors, cytoskeletal proteins and signaling proteins. This protein is a negative regulator of cell growth and may act as a tumor suppressor. This scaffold protein may also play a role in cell adhesion, cell spreading and in the reorganization of the actin cytoskeleton. Multiple protein isoforms are encoded by transcript variants of this gene. | 26136 | NA |
| pleckstrin | ENSG00000115956 | PLEK | NA | 5341 | NA |
| linker for activation of T-cells family member 2 | ENSG00000086730 | LAT2 | This gene is one of the contiguous genes at 7q11.23 commonly deleted in Williams syndrome, a multisystem developmental disorder. This gene consists of at least 14 exons, and its alternative splicing generates 3 transcript variants, all encoding the same protein. | 7462 | NA |
| troponin C1, slow skeletal and cardiac type | ENSG00000114854 | TNNC1 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | 7134 | NA |
| LYN proto-oncogene, Src family tyrosine kinase | ENSG00000254087 | LYN | This gene encodes a tyrosine protein kinase, which may be involved in the regulation of mast cell degranulation and erythroid differentiation. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 4067 | NA |
| regulator of G-protein signaling 2 | ENSG00000116741 | RGS2 | Regulator of G protein signaling (RGS) family members are regulatory molecules that act as GTPase activating proteins (GAPs) for G alpha subunits of heterotrimeric G proteins. RGS proteins are able to deactivate G protein subunits of the Gi alpha, Go alpha and Gq alpha subtypes. They drive G proteins into their inactive GDP-bound forms. Regulator of G protein signaling 2 belongs to this family. The protein acts as a mediator of myeloid differentiation and may play a role in leukemogenesis. | 5997 | NA |
| ADAM metallopeptidase domain 8 | ENSG00000151651 | ADAM8 | This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins, and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. The protein encoded by this gene may be involved in cell adhesion during neurodegeneration, and it is thought to be a target for allergic respiratory diseases, including asthma. Alternative splicing results in multiple transcript variants. | 101 | NA |
| platelet derived growth factor receptor beta | ENSG00000113721 | PDGFRB | This gene encodes a cell surface tyrosine kinase receptor for members of the platelet-derived growth factor family. These growth factors are mitogens for cells of mesenchymal origin. The identity of the growth factor bound to a receptor monomer determines whether the functional receptor is a homodimer or a heterodimer, composed of both platelet-derived growth factor receptor alpha and beta polypeptides. This gene is flanked on chromosome 5 by the genes for granulocyte-macrophage colony-stimulating factor and macrophage-colony stimulating factor receptor; all three genes may be implicated in the 5-q syndrome. A translocation between chromosomes 5 and 12, that fuses this gene to that of the translocation, ETV6, leukemia gene, results in chronic myeloproliferative disorder with eosinophilia. | 5159 | NA |
| synaptopodin 2 | ENSG00000172403 | SYNPO2 | NA | 171024 | NA |
| phosphodiesterase 4D interacting protein | ENSG00000178104 | PDE4DIP | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | 9659 | NA |
| four and a half LIM domains 2 | ENSG00000115641 | FHL2 | This gene encodes a member of the four-and-a-half-LIM-only protein family. Family members contain two highly conserved, tandemly arranged, zinc finger domains with four highly conserved cysteines binding a zinc atom in each zinc finger. This protein is thought to have a role in the assembly of extracellular membranes. Also, this gene is down-regulated during transformation of normal myoblasts to rhabdomyosarcoma cells and the encoded protein may function as a link between presenilin-2 and an intracellular signaling pathway. Multiple alternatively spliced variants encoding different isoforms have been identified. | 2274 | NA |
| CAP, adenylate cyclase-associated protein 1 (yeast) | ENSG00000131236 | CAP1 | The protein encoded by this gene is related to the S. cerevisiae CAP protein, which is involved in the cyclic AMP pathway. The human protein is able to interact with other molecules of the same protein, as well as with CAP2 and actin. Alternatively spliced transcript variants have been identified. | 10487 | NA |
| galectin 1 | ENSG00000100097 | LGALS1 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. This gene product may act as an autocrine negative growth factor that regulates cell proliferation. | 3956 | NA |
| pleckstrin homology and RhoGEF domain containing G5 | ENSG00000171680 | PLEKHG5 | This gene encodes a protein that activates the nuclear factor kappa B (NFKB1) signaling pathway. Mutations in this gene are associated with autosomal recessive distal spinal muscular atrophy. Multiple transcript variants encoding different isoforms have been found for this gene. | 57449 | NA |
| LIM domain and actin binding 1 | ENSG00000050405 | LIMA1 | This gene encodes a cytoskeleton-associated protein that inhibits actin filament depolymerization and cross-links filaments in bundles. It is downregulated in some cancer cell lines. Alternatively spliced transcript variants encoding different isoforms have been described for this gene, and expression of some of the variants may be independently regulated. | 51474 | NA |
| docking protein 3 | ENSG00000146094 | DOK3 | NA | 79930 | NA |
| Rho GTPase activating protein 45 | ENSG00000180448 | ARHGAP45 | NA | 23526 | NA |
| lipoprotein lipase | ENSG00000175445 | LPL | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | 4023 | NA |
| natriuretic peptide A | ENSG00000175206 | NPPA | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | 4878 | NA |
| TBC1 domain family member 1 | ENSG00000065882 | TBC1D1 | TBC1D1 is the founding member of a family of proteins sharing a 180- to 200-amino acid TBC domain presumed to have a role in regulating cell growth and differentiation. These proteins share significant homology with TRE2 (USP6; MIM 604334), yeast Bub2, and CDC16 (MIM 603461) (White et al., 2000 [PubMed 10965142]). | 23216 | NA |
| synemin | ENSG00000182253 | SYNM | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | 23336 | NA |
| S100 calcium binding protein A10 | ENSG00000197747 | S100A10 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in exocytosis and endocytosis. | 6281 | NA |
| pleckstrin and Sec7 domain containing 3 | ENSG00000156011 | PSD3 | NA | 23362 | NA |
| G-protein signaling modulator 3 | ENSG00000213654 | GPSM3 | NA | 63940 | NA |
| EGF containing fibulin like extracellular matrix protein 1 | ENSG00000115380 | EFEMP1 | This gene encodes a member of the fibulin family of extracellular matrix glycoproteins. Like all members of this family, the encoded protein contains tandemly repeated epidermal growth factor-like repeats followed by a C-terminus fibulin-type domain. This gene is upregulated in malignant gliomas and may play a role in the aggressive nature of these tumors. Mutations in this gene are associated with Doyne honeycomb retinal dystrophy. Alternatively spliced transcript variants that encode the same protein have been described. | 2202 | NA |
| Rho GDP dissociation inhibitor beta | ENSG00000111348 | ARHGDIB | Members of the Rho (or ARH) protein family (see MIM 165390) and other Ras-related small GTP-binding proteins (see MIM 179520) are involved in diverse cellular events, including cell signaling, proliferation, cytoskeletal organization, and secretion. The GTP-binding proteins are active only in the GTP-bound state. At least 3 classes of proteins tightly regulate cycling between the GTP-bound and GDP-bound states: GTPase-activating proteins (GAPs), guanine nucleotide-releasing factors (GRFs), and GDP-dissociation inhibitors (GDIs). The GDIs, including ARHGDIB, decrease the rate of GDP dissociation from Ras-like GTPases (summary by Scherle et al., 1993 [PubMed 8356058]). | 397 | NA |
| phosphatidylinositol-3,4,5-trisphosphate dependent Rac exchange factor 1 | ENSG00000124126 | PREX1 | The protein encoded by this gene acts as a guanine nucleotide exchange factor for the RHO family of small GTP-binding proteins (RACs). It has been shown to bind to and activate RAC1 by exchanging bound GDP for free GTP. The encoded protein, which is found mainly in the cytoplasm, is activated by phosphatidylinositol-3,4,5-trisphosphate and the beta-gamma subunits of heterotrimeric G proteins. | 57580 | NA |
| olfactomedin like 3 | ENSG00000116774 | OLFML3 | NA | 56944 | NA |
| neuritin 1 | ENSG00000124785 | NRN1 | This gene encodes a member of the neuritin family, and is expressed in postmitotic-differentiating neurons of the developmental nervous system and neuronal structures associated with plasticity in the adult. The expression of this gene can be induced by neural activity and neurotrophins. The encoded protein contains a consensus cleavage signal found in glycosylphosphatidylinositol (GPI)-anchored proteins. The encoded protein promotes neurite outgrowth and arborization, suggesting its role in promoting neuritogenesis. Overexpression of the encoded protein may be associated with astrocytoma progression. Alternative splicing results in multiple transcript variants. | 51299 | NA |
| thymocyte selection associated family member 2 | ENSG00000130775 | THEMIS2 | NA | 9473 | NA |
| myosin light chain kinase | ENSG00000065534 | MYLK | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | 4638 | NA |
| NA | ENSG00000263065 | AF001548.6 | NA | ENSG00000263065 | NA |
| ArfGAP with coiled-coil, ankyrin repeat and PH domains 1 | ENSG00000072818 | ACAP1 | NA | 9744 | NA |
| thrombospondin 1 | ENSG00000137801 | THBS1 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | 7057 | NA |
| Rho GTPase activating protein 30 | ENSG00000186517 | ARHGAP30 | NA | 257106 | NA |
| ACTA2 antisense RNA 1 | ENSG00000180139 | ACTA2-AS1 | NA | ENSG00000180139 | NA |
| tenascin XB | ENSG00000168477 | TNXB | This gene encodes a member of the tenascin family of extracellular matrix glycoproteins. The tenascins have anti-adhesive effects, as opposed to fibronectin which is adhesive. This protein is thought to function in matrix maturation during wound healing, and its deficiency has been associated with the connective tissue disorder Ehlers-Danlos syndrome. This gene localizes to the major histocompatibility complex (MHC) class III region on chromosome 6. It is one of four genes in this cluster which have been duplicated. The duplicated copy of this gene is incomplete and is a pseudogene which is transcribed but does not encode a protein. The structure of this gene is unusual in that it overlaps the CREBL1 and CYP21A2 genes at its 5’ and 3’ ends, respectively. Multiple transcript variants encoding different isoforms have been found for this gene. | 7148 | NA |
| serpin family F member 1 | ENSG00000132386 | SERPINF1 | The protein encoded by this gene is a member of the serpin family, although it does not display the serine protease inhibitory activity shown by many of the other serpin family members. The encoded protein is secreted and strongly inhibits angiogenesis. In addition, this protein is a neurotrophic factor involved in neuronal differentiation in retinoblastoma cells. | 5176 | NA |
| Jun proto-oncogene, AP-1 transcription factor subunit | ENSG00000177606 | JUN | This gene is the putative transforming gene of avian sarcoma virus 17. It encodes a protein which is highly similar to the viral protein, and which interacts directly with specific target DNA sequences to regulate gene expression. This gene is intronless and is mapped to 1p32-p31, a chromosomal region involved in both translocations and deletions in human malignancies. | 3725 | NA |
| MOB kinase activator 3A | ENSG00000172081 | MOB3A | NA | 126308 | NA |
| regulator of G-protein signaling 14 | ENSG00000169220 | RGS14 | This gene encodes a member of the regulator of G-protein signaling family. This protein contains one RGS domain, two Raf-like Ras-binding domains (RBDs), and one GoLoco domain. The protein attenuates the signaling activity of G-proteins by binding, through its GoLoco domain, to specific types of activated, GTP-bound G alpha subunits. Acting as a GTPase activating protein (GAP), the protein increases the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. Alternate transcriptional splice variants of this gene have been observed but have not been thoroughly characterized. | 10636 | NA |
| cysteine rich transmembrane BMP regulator 1 (chordin-like) | ENSG00000150938 | CRIM1 | This gene encodes a transmembrane protein containing six cysteine-rich repeat domains and an insulin-like growth factor-binding domain. The encoded protein may play a role in tissue development through interactions with members of the transforming growth factor beta family, such as bone morphogenetic proteins. | 51232 | NA |
write.table(out$query, file = paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 1, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
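The export above is repeated by hand for each cluster. As a minimal sketch, the same step could be wrapped in a small helper and applied to every row of the gene list. The helper name `write_cluster_ids`, the `out_dir` argument, and the toy matrix are assumptions made for illustration; only the `gene_names_clus_<k>.txt` naming comes from the analysis. In the real workflow the IDs would be `out$query` from `mygene::queryMany` rather than the toy rows used here.

```r
# Hypothetical helper (not part of the original analysis): write the
# Ensembl IDs of cluster k, one per line, to gene_names_clus_<k>.txt.
write_cluster_ids <- function(ids, k, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  write.table(ids, file = path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  path
}

# Toy stand-in for gene_list: one row of Ensembl IDs per cluster.
toy_gene_list <- rbind(c("ENSG00000099250", "ENSG00000135862"),
                       c("ENSG00000115414", "ENSG00000186395"))

# A temporary directory is used here instead of ../utilities/...
out_dir <- tempdir()
paths <- vapply(seq_len(nrow(toy_gene_list)),
                function(k) write_cluster_ids(toy_gene_list[k, ], k, out_dir),
                character(1))
```

Keeping the cluster index as an argument avoids editing the hard-coded `1` in the file name for each new cluster.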
out <- mygene::queryMany(gene_list[2,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| summary | X_id | query | symbol | name |
|---|---|---|---|---|
| This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | ENSG00000115414 | FN1 | fibronectin 1 |
| This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | ENSG00000186395 | KRT10 | keratin 10 |
| This gene encodes the alpha chain of type XVIII collagen. This collagen is one of the multiplexins, extracellular matrix proteins that contain multiple triple-helix domains (collagenous domains) interrupted by non-collagenous domains. A long isoform of the protein has an N-terminal domain that is homologous to the extracellular part of frizzled receptors. Proteolytic processing at several endogenous cleavage sites in the C-terminal domain results in production of endostatin, a potent antiangiogenic protein that is able to inhibit angiogenesis and tumor growth. Mutations in this gene are associated with Knobloch syndrome. The main features of this syndrome involve retinal abnormalities, so type XVIII collagen may play an important role in retinal structure and in neural tube closure. Alternative splicing results in multiple transcript variants. | 80781 | ENSG00000182871 | COL18A1 | collagen type XVIII alpha 1 chain |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3848 | ENSG00000167768 | KRT1 | keratin 1 |
| This gene encodes apolipoprotein A-I, which is the major protein component of high density lipoprotein (HDL) in plasma. The encoded preproprotein is proteolytically processed to generate the mature protein, which promotes cholesterol efflux from tissues to the liver for excretion, and is a cofactor for lecithin cholesterolacyltransferase (LCAT), an enzyme responsible for the formation of most plasma cholesteryl esters. This gene is closely linked with two other apolipoprotein genes on chromosome 11. Defects in this gene are associated with HDL deficiencies, including Tangier disease, and with systemic non-neuropathic amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein. | 335 | ENSG00000118137 | APOA1 | apolipoprotein A1 |
| The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | 4856 | ENSG00000136999 | NOV | nephroblastoma overexpressed |
| This gene encodes the heavy chain subunit of the pre-alpha-trypsin inhibitor complex. This complex may stabilize the extracellular matrix through its ability to bind hyaluronic acid. Polymorphisms of this gene may be associated with increased risk for schizophrenia and major depressive disorder. This gene is present in an inter-alpha-trypsin inhibitor family gene cluster on chromosome 3. | 3699 | ENSG00000162267 | ITIH3 | inter-alpha-trypsin inhibitor heavy chain 3 |
| The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | 4155 | ENSG00000197971 | MBP | myelin basic protein |
| The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | 4256 | ENSG00000111341 | MGP | matrix Gla protein |
| The protein encoded by this gene, pre-angiotensinogen or angiotensinogen precursor, is expressed in the liver and is cleaved by the enzyme renin in response to lowered blood pressure. The resulting product, angiotensin I, is then cleaved by angiotensin converting enzyme (ACE) to generate the physiologically active enzyme angiotensin II. The protein is involved in maintaining blood pressure and in the pathogenesis of essential hypertension and preeclampsia. Mutations in this gene are associated with susceptibility to essential hypertension, and can cause renal tubular dysgenesis, a severe disorder of renal tubular development. Defects in this gene have also been associated with non-familial structural atrial fibrillation, and inflammatory bowel disease. | 183 | ENSG00000135744 | AGT | angiotensinogen |
| Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | 213 | ENSG00000163631 | ALB | albumin |
| The protein encoded by this gene is a metalloprotein that binds most of the copper in plasma and is involved in the peroxidation of Fe(II)transferrin to Fe(III) transferrin. Mutations in this gene cause aceruloplasminemia, which results in iron accumulation and tissue damage, and is associated with diabetes and neurologic abnormalities. Two transcript variants, one protein-coding and the other not protein-coding, have been found for this gene. | 1356 | ENSG00000047457 | CP | ceruloplasmin (ferroxidase) |
| ARHGEF10L is a member of the RhoGEF family of guanine nucleotide exchange factors (GEFs) that activate Rho GTPases (Winkler et al., 2005 [PubMed 16112081]). | 55160 | ENSG00000074964 | ARHGEF10L | Rho guanine nucleotide exchange factor 10 like |
| The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | 6876 | ENSG00000149591 | TAGLN | transgelin |
| The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | 7057 | ENSG00000137801 | THBS1 | thrombospondin 1 |
| The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 59 | ENSG00000107796 | ACTA2 | actin, alpha 2, smooth muscle, aorta |
| This gene encodes the anterior pituitary hormone prolactin. This secreted hormone is a growth regulator for many tissues, including cells of the immune system. It may also play a role in cell survival by suppressing apoptosis, and it is essential for lactation. Alternative splicing results in multiple transcript variants that encode the same protein. | 5617 | ENSG00000172179 | PRL | prolactin |
| This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | ENSG00000026025 | VIM | vimentin |
| The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | 4629 | ENSG00000133392 | MYH11 | myosin, heavy chain 11, smooth muscle |
| The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | 1191 | ENSG00000120885 | CLU | clusterin |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3849 | ENSG00000172867 | KRT2 | keratin 2 |
| Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | 70 | ENSG00000159251 | ACTC1 | actin, alpha, cardiac muscle 1 |
| This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | 2243 | ENSG00000171560 | FGA | fibrinogen alpha chain |
| Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-I and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | 345 | ENSG00000110245 | APOC3 | apolipoprotein C3 |
| The mitochondrial enzyme encoded by this gene catalyzes synthesis of carbamoyl phosphate from ammonia and bicarbonate. This reaction is the first committed step of the urea cycle, which is important in the removal of excess urea from cells. The encoded protein may also represent a core mitochondrial nucleoid protein. Three transcript variants encoding different isoforms have been found for this gene. The shortest isoform may not be localized to the mitochondrion. Mutations in this gene have been associated with carbamoyl phosphate synthetase deficiency, susceptibility to persistent pulmonary hypertension, and susceptibility to venoocclusive disease after bone marrow transplantation. | 1373 | ENSG00000021826 | CPS1 | carbamoyl-phosphate synthase 1 |
| The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | 5730 | ENSG00000107317 | PTGDS | prostaglandin D2 synthase |
| The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 2244 | ENSG00000171564 | FGB | fibrinogen beta chain |
| This gene encodes a regulatory subunit of protein phosphatase-1 (PP1). PP1 catalyzes reversible protein phosphorylation, which is important in a wide range of cellular activities: neuronal, muscular, RNA splicing, protein synthesis, cell death, and glycogen metabolism, to name just a few. By interacting with different regulatory subunits, PP1 is directed to different parts of the cell, to different substrates, or to respond to extracellular signals. | 5507 | ENSG00000119938 | PPP1R3C | protein phosphatase 1 regulatory subunit 3C |
| The protein encoded by this gene is an enzyme in the catabolic pathway of tyrosine. The encoded protein catalyzes the conversion of 4-hydroxyphenylpyruvate to homogentisate. Defects in this gene are a cause of tyrosinemia type 3 (TYRO3) and hawkinsinuria (HAWK). Two transcript variants encoding different isoforms have been found for this gene. | 3242 | ENSG00000158104 | HPD | 4-hydroxyphenylpyruvate dioxygenase |
| The protein encoded by this gene is one of six similar proteins that bind insulin-like growth factors I and II (IGF-I and IGF-II). The encoded protein can be secreted into the bloodstream, where it binds IGF-I and IGF-II with high affinity, or it can remain intracellular, interacting with many different ligands. High expression levels of this protein promote the growth of several types of tumors and may be predictive of the chances of recovery of the patient. Several transcript variants, one encoding a secreted isoform and the others encoding nonsecreted isoforms, have been found for this gene. | 3485 | ENSG00000115457 | IGFBP2 | insulin like growth factor binding protein 2 |
| The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-1 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-1 expression has been detected in several different tumor types. While several transcript variants may exist for this gene, the full-length natures of only two have been described to date. These two represent the major variants of this gene and encode the same protein. | 6382 | ENSG00000115884 | SDC1 | syndecan 1 |
| This gene is upregulated in inflammatory diseases, and it was first observed as expressed in the differentiated layers of skin. The most interesting aspect of this gene is the differential use of promoters and terminators to generate isoforms with unique cellular distributions and domain components. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | 93099 | ENSG00000161249 | DMKN | dermokine |
| The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | 7139 | ENSG00000118194 | TNNT2 | troponin T2, cardiac type |
| This gene encodes a member of the CHD family of proteins which are characterized by the presence of chromo (chromatin organization modifier) domains and SNF2-related helicase/ATPase domains. This protein is one of the components of a histone deacetylase complex referred to as the Mi-2/NuRD complex which participates in the remodeling of chromatin by deacetylating histones. Chromatin remodeling is essential for many processes including transcription. Autoantibodies against this protein are found in a subset of patients with dermatomyositis. Three alternatively spliced transcripts encoding different isoforms have been described. | 1107 | ENSG00000170004 | CHD3 | chromodomain helicase DNA binding protein 3 |
| Arg and c-Abl represent the mammalian members of the Abelson family of non-receptor protein-tyrosine kinases. They interact with the Arg/Abl binding proteins via the SH3 domains present in the carboxy end of the latter group of proteins. This gene encodes the sorbin and SH3 domain containing 2 protein. It has three C-terminal SH3 domains and an N-terminal sorbin homology (SoHo) domain that interacts with lipid raft proteins. The subcellular localization of this protein in epithelial and cardiac muscle cells suggests that it functions as an adapter protein to assemble signaling complexes in stress fibers, and that it is a potential link between Abl family kinases and the actin cytoskeleton. Alternative splicing results in multiple transcript variants encoding different isoforms. | 8470 | ENSG00000154556 | SORBS2 | sorbin and SH3 domain containing 2 |
| This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 | ENSG00000011465 | DCN | decorin |
| Members of the CELF/BRUNOL protein family contain two N-terminal RNA recognition motif (RRM) domains, one C-terminal RRM domain, and a divergent segment of 160-230 aa between the second and third RRM domains. Members of this protein family regulate pre-mRNA alternative splicing and may also be involved in mRNA editing, and translation. Alternative splicing results in multiple transcript variants encoding different isoforms. | 10659 | ENSG00000048740 | CELF2 | CUGBP, Elav-like family member 2 |
| This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | ENSG00000175084 | DES | desmin |
| This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | 5004 | ENSG00000229314 | ORM1 | orosomucoid 1 |
| This gene encodes a mitochondrially localized enzyme that catalyzes the reversible formation of acetoacetyl-CoA from two molecules of acetyl-CoA. Defects in this gene are associated with 3-ketothiolase deficiency, an inborn error of isoleucine catabolism characterized by urinary excretion of 2-methyl-3-hydroxybutyric acid, 2-methylacetoacetic acid, tiglylglycine, and butanone. | 38 | ENSG00000075239 | ACAT1 | acetyl-CoA acetyltransferase 1 |
| This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | 3240 | ENSG00000257017 | HP | haptoglobin |
| This gene encodes a member of the transducer of erbB-2 /B-cell translocation gene protein family. Members of this family are anti-proliferative factors that have the potential to regulate cell growth. The encoded protein may function as a tumor suppressor. Alternate splicing results in multiple transcript variants. | 10140 | ENSG00000141232 | TOB1 | transducer of ERBB2, 1 |
| This gene is a member of the Regulator of Complement Activation (RCA) gene cluster and encodes a protein with twenty short consensus repeat (SCR) domains. This protein is secreted into the bloodstream and has an essential role in the regulation of complement activation, restricting this innate defense mechanism to microbial infections. Mutations in this gene have been associated with hemolytic-uremic syndrome (HUS) and chronic hypocomplementemic nephropathy. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 3075 | ENSG00000000971 | CFH | complement factor H |
| This gene encodes a leucine-rich cytoplasmic protein, which is highly similar to a mouse protein that negatively regulates Ca/calmodulin-dependent protein kinase II phosphorylation and may be essential for spatial learning processes. Several alternatively spliced transcript variants of this gene have been described. | 23154 | ENSG00000020129 | NCDN | neurochondrin |
| This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1277 | ENSG00000108821 | COL1A1 | collagen type I alpha 1 |
| This gene encodes the heavy subunit of ferritin, the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in ferritin proteins are associated with several neurodegenerative diseases. This gene has multiple pseudogenes. Several alternatively spliced transcript variants have been observed, but their biological validity has not been determined. | 2495 | ENSG00000167996 | FTH1 | ferritin heavy chain 1 |
| This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | 5644 | ENSG00000204983 | PRSS1 | protease, serine 1 |
| The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | 2752 | ENSG00000135821 | GLUL | glutamate-ammonia ligase |
| This gene encodes a member of the insulin-like growth factor (IGF)-binding protein (IGFBP) family. IGFBPs bind IGFs with high affinity, and regulate IGF availability in body fluids and tissues and modulate IGF binding to its receptors. This protein binds IGF-I and IGF-II with relatively low affinity, and belongs to a subfamily of low-affinity IGFBPs. It also stimulates prostacyclin production and cell adhesion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene, and one variant has been associated with retinal arterial macroaneurysm (PMID:21835307). | 3490 | ENSG00000163453 | IGFBP7 | insulin like growth factor binding protein 7 |
| This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | 1832 | ENSG00000096696 | DSP | desmoplakin |
| The protein encoded by this gene belongs to the Kank family of proteins, which contain multiple ankyrin repeat domains. This family member functions in cytoskeleton formation by regulating actin polymerization. This gene is a candidate tumor suppressor for renal cell carcinoma. Mutations in this gene cause cerebral palsy spastic quadriplegic type 2, a central nervous system development disorder. A t(5;9) translocation results in fusion of the platelet-derived growth factor receptor beta gene (PDGFRB) on chromosome 5 with this gene in a myeloproliferative neoplasm featuring severe thrombocythemia. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 20. | 23189 | ENSG00000107104 | KANK1 | KN motif and ankyrin repeat domains 1 |
| The protein encoded by this gene is the gamma component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia and thrombophilia. Alternative splicing results in transcript variants encoding different isoforms. | 2266 | ENSG00000171557 | FGG | fibrinogen gamma chain |
| The protein encoded by this gene is a member of the kinesin family and functions as an anterograde motor protein that transports membranous organelles along axonal microtubules. Mutations at this locus have been associated with spastic paraplegia-30 and hereditary sensory neuropathy IIC. Alternatively spliced transcript variants encoding distinct isoforms have been described. | 547 | ENSG00000130294 | KIF1A | kinesin family member 1A |
| The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 | ENSG00000169710 | FASN | fatty acid synthase |
| The protein encoded by this gene belongs to a family of bifunctional proteins that are involved in both the synthesis and degradation of fructose-2,6-bisphosphate, a regulatory molecule that controls glycolysis in eukaryotes. The encoded protein has a 6-phosphofructo-2-kinase activity that catalyzes the synthesis of fructose-2,6-bisphosphate (F2,6BP), and a fructose-2,6-biphosphatase activity that catalyzes the degradation of F2,6BP. This protein is required for cell cycle progression and prevention of apoptosis. It functions as a regulator of cyclin-dependent kinase 1, linking glucose metabolism to cell proliferation and survival in tumor cells. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5209 | ENSG00000170525 | PFKFB3 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3 |
| This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. | 10529 | ENSG00000078114 | NEBL | nebulette |
| Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | 229 | ENSG00000136872 | ALDOB | aldolase, fructose-bisphosphate B |
| This gene encodes a preproprotein that undergoes extensive, tissue-specific, post-translational processing via cleavage by subtilisin-like enzymes known as prohormone convertases. There are eight potential cleavage sites within the preproprotein and, depending on tissue type and the available convertases, processing may yield as many as ten biologically active peptides involved in diverse cellular functions. The encoded protein is synthesized mainly in corticotroph cells of the anterior pituitary where four cleavage sites are used; adrenocorticotrophin, essential for normal steroidogenesis and the maintenance of normal adrenal weight, and lipotropin beta are the major end products. In other tissues, including the hypothalamus, placenta, and epithelium, all cleavage sites may be used, giving rise to peptides with roles in pain and energy homeostasis, melanocyte stimulation, and immune modulation. These include several distinct melanotropins, lipotropins, and endorphins that are contained within the adrenocorticotrophin and beta-lipotropin peptides. The antimicrobial melanotropin alpha peptide exhibits antibacterial and antifungal activity. Mutations in this gene have been associated with early onset obesity, adrenal insufficiency, and red hair pigmentation. Alternatively spliced transcript variants encoding the same protein have been described. | 5443 | ENSG00000115138 | POMC | proopiomelanocortin |
| The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | 5350 | ENSG00000198523 | PLN | phospholamban |
| Tight junctions represent one mode of cell-to-cell adhesion in epithelial or endothelial cell sheets, forming continuous seals around cells and serving as a physical barrier to prevent solutes and water from passing freely through the paracellular space. These junctions are comprised of sets of continuous networking strands in the outwardly facing cytoplasmic leaflet, with complementary grooves in the inwardly facing extracytoplasmic leaflet. The protein encoded by this gene, a member of the claudin family, is an integral membrane protein and a component of tight junction strands. Loss of function mutations result in neonatal ichthyosis-sclerosing cholangitis syndrome. | 9076 | ENSG00000163347 | CLDN1 | claudin 1 |
| The protein encoded by this gene is a plasma glycoprotein of unknown function. The protein shows sequence similarity to the variable regions of some immunoglobulin supergene family member proteins. | 1 | ENSG00000121410 | A1BG | alpha-1-B glycoprotein |
| The protein encoded by this gene is a member of the somatotropin/prolactin family of hormones which play an important role in growth control. The gene, along with four other related genes, is located at the growth hormone locus on chromosome 17 where they are interspersed in the same transcriptional orientation; an arrangement which is thought to have evolved by a series of gene duplications. The five genes share a remarkably high degree of sequence identity. Alternative splicing generates additional isoforms of each of the five growth hormones, leading to further diversity and potential for specialization. This particular family member is expressed in the pituitary but not in placental tissue as is the case for the other four genes in the growth hormone locus. Mutations in or deletions of the gene lead to growth hormone deficiency and short stature. | 2688 | ENSG00000259384 | GH1 | growth hormone 1 |
| The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This protein contains a C-terminal PTP domain and an N-terminal domain homologous to the band 4.1 superfamily of cytoskeletal-associated proteins. P97, a cell cycle regulator involved in a variety of membrane related functions, has been shown to be a substrate of this PTP. This PTP was also found to interact with, and be regulated by adaptor protein 14-3-3 beta. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5774 | ENSG00000070159 | PTPN3 | protein tyrosine phosphatase, non-receptor type 3 |
| This gene encodes a member of the paralemmin protein family. The product of this gene is a prenylated and palmitoylated phosphoprotein that associates with the cytoplasmic face of plasma membranes and is implicated in plasma membrane dynamics in neurons and other cell types. Several alternatively spliced transcript variants have been identified, but the full-length nature of only two transcript variants has been determined. | 5064 | ENSG00000099864 | PALM | paralemmin |
| NA | 81618 | ENSG00000135916 | ITM2C | integral membrane protein 2C |
| This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | 5662 | ENSG00000059915 | PSD | pleckstrin and Sec7 domain containing |
| This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | 1357 | ENSG00000091704 | CPA1 | carboxypeptidase A1 |
| The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | 5265 | ENSG00000197249 | SERPINA1 | serpin family A member 1 |
| NA | 3488 | ENSG00000115461 | IGFBP5 | insulin like growth factor binding protein 5 |
| This gene encodes a plasma glycoprotein that binds heme with high affinity. The encoded protein is an acute phase protein that transports heme from the plasma to the liver and may be involved in protecting cells from oxidative stress. | 3263 | ENSG00000110169 | HPX | hemopexin |
| This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | 8490 | ENSG00000143248 | RGS5 | regulator of G-protein signaling 5 |
| This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | 5406 | ENSG00000175535 | PNLIP | pancreatic lipase |
| Carbonic anhydrases (CAs) are a large family of zinc metalloenzymes that catalyze the reversible hydration of carbon dioxide. They participate in a variety of biological processes, including respiration, calcification, acid-base balance, bone resorption, and the formation of aqueous humor, cerebrospinal fluid, saliva, and gastric acid. They show extensive diversity in tissue distribution and in their subcellular localization. CA XI is likely a secreted protein, however, radical changes at active site residues completely conserved in CA isozymes with catalytic activity, make it unlikely that it has carbonic anhydrase activity. It shares properties in common with two other acatalytic CA isoforms, CA VIII and CA X. CA XI is most abundantly expressed in brain, and may play a general role in the central nervous system. | 770 | ENSG00000063180 | CA11 | carbonic anhydrase 11 |
| This gene encodes a member of the membrane-associated guanylate kinase (MAGUK) family. It heteromultimerizes with another MAGUK protein, DLG2, and is recruited into NMDA receptor and potassium channel clusters. These two MAGUK proteins may interact at postsynaptic sites to form a multimeric scaffold for the clustering of receptors, ion channels, and associated signaling proteins. Multiple transcript variants encoding different isoforms have been found for this gene. | 1742 | ENSG00000132535 | DLG4 | discs large MAGUK scaffold protein 4 |
| This gene encodes a member of the phosphatidylethanolamine-binding family of proteins and has been shown to modulate multiple signaling pathways, including the MAP kinase (MAPK), NF-kappa B, and glycogen synthase kinase-3 (GSK-3) signaling pathways. The encoded protein can be further processed to form a smaller cleavage product, hippocampal cholinergic neurostimulating peptide (HCNP), which may be involved in neural development. This gene has been implicated in numerous human cancers and may act as a metastasis suppressor gene. Multiple pseudogenes of this gene have been identified in the genome. | 5037 | ENSG00000089220 | PEBP1 | phosphatidylethanolamine binding protein 1 |
| NA | 255743 | ENSG00000168743 | NPNT | nephronectin |
| NA | ENSG00000266844 | ENSG00000266844 | RP11-862L9.3 | NA |
| This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | 2813 | ENSG00000169347 | GP2 | glycoprotein 2 |
| This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | 3861 | ENSG00000186847 | KRT14 | keratin 14 |
| NA | ENSG00000268230 | ENSG00000268230 | CTD-2619J13.8 | NA |
| This gene encodes a classical cadherin and member of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate a calcium-dependent cell adhesion molecule and glycoprotein. This protein plays a role in the establishment of left-right asymmetry, development of the nervous system and the formation of cartilage and bone. | 1000 | ENSG00000170558 | CDH2 | cadherin 2 |
| Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | 4624 | ENSG00000197616 | MYH6 | myosin, heavy chain 6, cardiac muscle, alpha |
| The protein encoded by this gene is a member of the pexin family. It is found in serum and tissues and promotes cell adhesion and spreading, inhibits the membrane-damaging effect of the terminal cytolytic complement pathway, and binds to several serpin serine protease inhibitors. It is a secreted protein and exists in either a single chain form or a clipped, two chain form held together by a disulfide bond. | 7448 | ENSG00000109072 | VTN | vitronectin |
| This gene encodes a bifunctional signal transduction molecule. Dopaminergic and glutamatergic receptor stimulation regulates its phosphorylation and function as a kinase or phosphatase inhibitor. As a target for dopamine, this gene may serve as a therapeutic target for neurologic and psychiatric disorders. Multiple transcript variants encoding different isoforms have been found for this gene. | 84152 | ENSG00000131771 | PPP1R1B | protein phosphatase 1 regulatory inhibitor subunit 1B |
| The protein encoded by this gene is one of two large chain components of the assembly protein complex 2, which serves to link clathrin to receptors in coated vesicles. The encoded protein is found on the cytoplasmic face of coated vesicles in the plasma membrane. Two transcript variants encoding different isoforms have been found for this gene. | 163 | ENSG00000006125 | AP2B1 | adaptor related protein complex 2 beta 1 subunit |
| Arginase catalyzes the hydrolysis of arginine to ornithine and urea. At least two isoforms of mammalian arginase exist (types I and II) which differ in their tissue distribution, subcellular localization, immunologic crossreactivity and physiologic function. The type I isoform encoded by this gene, is a cytosolic enzyme and expressed predominantly in the liver as a component of the urea cycle. Inherited deficiency of this enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. Two transcript variants encoding different isoforms have been found for this gene. | 383 | ENSG00000118520 | ARG1 | arginase 1 |
| This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | 6319 | ENSG00000099194 | SCD | stearoyl-CoA desaturase |
| This gene encodes a thiamine-dependent enzyme which plays a role in the channeling of excess sugar phosphates to glycolysis in the pentose phosphate pathway. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 7086 | ENSG00000163931 | TKT | transketolase |
| This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | 1292 | ENSG00000142173 | COL6A2 | collagen type VI alpha 2 |
| This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 7169 | ENSG00000198467 | TPM2 | tropomyosin 2 (beta) |
| NA | 4495 | ENSG00000125144 | MT1G | metallothionein 1G |
| NA | ENSG00000225670 | ENSG00000225670 | CADM3-AS1 | CADM3 antisense RNA 1 |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The encoded protein metabolizes drugs as well as the steroid hormones testosterone and progesterone. This gene is part of a cluster of cytochrome P450 genes on chromosome 7q21.1. Two pseudogenes of this gene have been identified within this cluster on chromosome 7. Expression of this gene is widely variable among populations, and a single nucleotide polymorphism that affects transcript splicing has been associated with susceptibility to hypertension. Alternative splicing results in multiple transcript variants. | 1577 | ENSG00000106258 | CYP3A5 | cytochrome P450 family 3 subfamily A member 5 |
| This gene encodes a protein that belongs to the microtubule-associated protein family. The proteins of this family are thought to be involved in microtubule assembly, which is an essential step in neurogenesis. The product of this gene is a precursor polypeptide that presumably undergoes proteolytic processing to generate the final MAP1A heavy chain and LC2 light chain. Expression of this gene is almost exclusively in the brain. Studies of the rat microtubule-associated protein 1A gene suggested a role in early events of spinal cord development. | 4130 | ENSG00000166963 | MAP1A | microtubule associated protein 1A |
| This gene encodes a member of the FXYD family of transmembrane proteins. This particular protein encodes phosphohippolin, which likely affects the activity of Na,K-ATPase. Multiple alternatively spliced transcript variants encoding the same protein have been described. Related pseudogenes have been identified on chromosomes 10 and X. Read-through transcripts have been observed between this locus and the downstream sodium/potassium-transporting ATPase subunit gamma (FXYD2, GeneID 486) locus. | 53826 | ENSG00000137726 | FXYD6 | FXYD domain containing ion transport regulator 6 |
| Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | 10136 | ENSG00000142789 | CELA3A | chymotrypsin like elastase family member 3A |
| This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | 5376 | ENSG00000109099 | PMP22 | peripheral myelin protein 22 |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | ENSG00000171401 | KRT13 | keratin 13 |
| This gene encodes a member of the dynamin subfamily of GTP-binding proteins. The encoded protein possesses unique mechanochemical properties used to tubulate and sever membranes, and is involved in clathrin-mediated endocytosis and other vesicular trafficking processes. Actin and other cytoskeletal proteins act as binding partners for the encoded protein, which can also self-assemble leading to stimulation of GTPase activity. More than sixty highly conserved copies of the 3’ region of this gene are found elsewhere in the genome, particularly on chromosomes Y and 15. Alternatively spliced transcript variants encoding different isoforms have been described. | 1759 | ENSG00000106976 | DNM1 | dynamin 1 |
| Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | 72 | ENSG00000163017 | ACTG2 | actin, gamma 2, smooth muscle, enteric |
# Save the queried Ensembl gene IDs for factor 2, one ID per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 2, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[3,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | name | query | summary | notfound |
|---|---|---|---|---|---|
| MBP | 4155 | myelin basic protein | ENSG00000197971 | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | NA |
| MYH7 | 4625 | myosin, heavy chain 7, cardiac muscle, beta | ENSG00000092054 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | NA |
| RP11-862L9.3 | ENSG00000266844 | NA | ENSG00000266844 | NA | NA |
| FTL | 2512 | ferritin, light polypeptide | ENSG00000087086 | This gene encodes the light subunit of the ferritin protein. Ferritin is the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in this light chain ferritin gene are associated with several neurodegenerative diseases and hyperferritinemia-cataract syndrome. This gene has multiple pseudogenes. | NA |
| DES | 1674 | desmin | ENSG00000175084 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | NA |
| MYH11 | 4629 | myosin, heavy chain 11, smooth muscle | ENSG00000133392 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| GFAP | 2670 | glial fibrillary acidic protein | ENSG00000131095 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | NA |
| TPO | 7173 | thyroid peroxidase | ENSG00000115705 | This gene encodes a membrane-bound glycoprotein. The encoded protein acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones, thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter, and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene, but the full-length nature of some variants has not been determined. | NA |
| TG | 7038 | thyroglobulin | ENSG00000042832 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | NA |
| CYP11B1 | 1584 | cytochrome P450 family 11 subfamily B member 1 | ENSG00000160882 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and is involved in the conversion of progesterone to cortisol in the adrenal cortex. Mutations in this gene cause congenital adrenal hyperplasia due to 11-beta-hydroxylase deficiency. Transcript variants encoding different isoforms have been noted for this gene. | NA |
| SERPINA1 | 5265 | serpin family A member 1 | ENSG00000197249 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | NA |
| IGFBP5 | 3488 | insulin like growth factor binding protein 5 | ENSG00000115461 | NA | NA |
| KRT14 | 3861 | keratin 14 | ENSG00000186847 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. | NA |
| LPAR1 | 1902 | lysophosphatidic acid receptor 1 | ENSG00000198121 | The integral membrane protein encoded by this gene is a lysophosphatidic acid (LPA) receptor from a group known as EDG receptors. These receptors are members of the G protein-coupled receptor superfamily. Utilized by LPA for cell signaling, EDG receptors mediate diverse biologic functions, including proliferation, platelet aggregation, smooth muscle contraction, inhibition of neuroblastoma cell differentiation, chemotaxis, and tumor cell invasion. Two transcript variants encoding the same protein have been identified for this gene. | NA |
| CYP17A1 | 1586 | cytochrome P450 family 17 subfamily A member 1 | ENSG00000148795 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | NA |
| ACTA2 | 59 | actin, alpha 2, smooth muscle, aorta | ENSG00000107796 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | NA |
| ALB | 213 | albumin | ENSG00000163631 | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | NA |
| QDPR | 5860 | quinoid dihydropteridine reductase | ENSG00000151552 | This gene encodes the enzyme dihydropteridine reductase, which catalyzes the NADH-mediated reduction of quinonoid dihydrobiopterin. This enzyme is an essential component of the pterin-dependent aromatic amino acid hydroxylating systems. Mutations in this gene resulting in QDPR deficiency include aberrant splicing, amino acid substitutions, insertions, or premature terminations. Dihydropteridine reductase deficiency presents as atypical phenylketonuria due to insufficient production of biopterin, a cofactor for phenylalanine hydroxylase. | NA |
| SAA1 | 6288 | serum amyloid A1 | ENSG00000173432 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | NA |
| TNNT2 | 7139 | troponin T2, cardiac type | ENSG00000118194 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | NA |
| APOD | 347 | apolipoprotein D | ENSG00000189058 | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | NA |
| RBP4 | 5950 | retinol binding protein 4 | ENSG00000138207 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | NA |
| HBB | 3043 | hemoglobin subunit beta | ENSG00000244734 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | NA |
| ACTG2 | 72 | actin, gamma 2, smooth muscle, enteric | ENSG00000163017 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | NA |
| ABCA2 | 20 | ATP binding cassette subfamily A member 2 | ENSG00000107331 | The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. This protein is highly expressed in brain tissue and may play a role in macrophage lipid metabolism and neural development. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| SAA2 | 6289 | serum amyloid A2 | ENSG00000134339 | NA | NA |
| TGM2 | 7052 | transglutaminase 2 | ENSG00000198959 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| MAP4 | 4134 | microtubule associated protein 4 | ENSG00000047849 | The protein encoded by this gene is a major non-neuronal microtubule-associated protein. This protein contains a domain similar to the microtubule-binding domains of neuronal microtubule-associated protein (MAP2) and microtubule-associated protein tau (MAPT/TAU). This protein promotes microtubule assembly, and has been shown to counteract destabilization of interphase microtubule catastrophe promotion. Cyclin B was found to interact with this protein, which targets cell division cycle 2 (CDC2) kinase to microtubules. The phosphorylation of this protein affects microtubule properties and cell cycle progression. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| SERPINE1 | 5054 | serpin family E member 1 | ENSG00000106366 | This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| KRT8 | 3856 | keratin 8 | ENSG00000170421 | This gene is a member of the type II keratin family clustered on the long arm of chromosome 12. Type I and type II keratins heteropolymerize to form intermediate-sized filaments in the cytoplasm of epithelial cells. The product of this gene typically dimerizes with keratin 18 to form an intermediate filament in simple single-layered epithelial cells. This protein plays a role in maintaining cellular structural integrity and also functions in signal transduction and cellular differentiation. Mutations in this gene cause cryptogenic cirrhosis. Alternatively spliced transcript variants have been found for this gene. | NA |
| MYH6 | 4624 | myosin, heavy chain 6, cardiac muscle, alpha | ENSG00000197616 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | NA |
| RP11-394O4.5 | ENSG00000269936 | NA | ENSG00000269936 | NA | NA |
| SPINK5 | 11005 | serine peptidase inhibitor, Kazal type 5 | ENSG00000133710 | This gene encodes a multidomain serine protease inhibitor that contains 15 potential inhibitory domains. The encoded preproprotein is proteolytically processed to generate multiple protein products, which may exhibit unique activities and specificities. These proteins may play a role in skin and hair morphogenesis, as well as anti-inflammatory and antimicrobial protection of mucous epithelia. Mutations in this gene may result in Netherton syndrome, a disorder characterized by ichthyosis, defective cornification, and atopy. This gene is present in a gene cluster on chromosome 5. Alternative splicing results in multiple transcript variants. | NA |
| MPZ | 4359 | myelin protein zero | ENSG00000158887 | This gene is specifically expressed in Schwann cells of the peripheral nervous system and encodes a type I transmembrane glycoprotein that is a major structural protein of the peripheral myelin sheath. The encoded protein contains a large hydrophobic extracellular domain and a smaller basic intracellular domain, which are essential for the formation and stabilization of the multilamellar structure of the compact myelin. Mutations in this gene are associated with autosomal dominant form of Charcot-Marie-Tooth disease type 1 (CMT1B) and other polyneuropathies, such as Dejerine-Sottas syndrome (DSS) and congenital hypomyelinating neuropathy (CHN). A recent study showed that two isoforms are produced from the same mRNA by use of alternative in-frame translation termination codons via a stop codon readthrough mechanism. | NA |
| GLUL | 2752 | glutamate-ammonia ligase | ENSG00000135821 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | NA |
| SAA2-SAA4 | 100528017 | SAA2-SAA4 readthrough | ENSG00000255071 | This locus represents naturally occurring read-through transcription between the neighboring serum amyloid A2 and serum amyloid A4 genes on chromosome 11. The read-through transcript produces a fusion protein that shares sequence identity with each individual gene product. | NA |
| NDRG2 | 57447 | NDRG family member 2 | ENSG00000165795 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | NA |
| ALDOB | 229 | aldolase, fructose-bisphosphate B | ENSG00000136872 | Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | NA |
| MYL7 | 58498 | myosin light chain 7 | ENSG00000106631 | NA | NA |
| MFGE8 | 4240 | milk fat globule-EGF factor 8 protein | ENSG00000140545 | This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | NA |
| HP | 3240 | haptoglobin | ENSG00000257017 | This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| MYL2 | 4633 | myosin light chain 2 | ENSG00000111245 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | NA |
| HBA2 | 3040 | hemoglobin subunit alpha 2 | ENSG00000188536 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | NA |
| FEZ1 | 9638 | fasciculation and elongation protein zeta 1 | ENSG00000149557 | This gene is an ortholog of the C. elegans unc-76 gene, which is necessary for normal axonal bundling and elongation within axon bundles. Expression of this gene in C. elegans unc-76 mutants can restore to the mutants partial locomotion and axonal fasciculation, suggesting that it also functions in axonal outgrowth. The N-terminal half of the gene product is highly acidic. Alternatively spliced transcript variants encoding different isoforms of this protein have been described. | NA |
| FNBP1 | 23048 | formin binding protein 1 | ENSG00000187239 | The protein encoded by this gene is a member of the formin-binding-protein family. The protein contains an N-terminal Fer/Cdc42-interacting protein 4 (CIP4) homology (FCH) domain followed by a coiled-coil domain, a proline-rich motif, a second coiled-coil domain, a Rho family protein-binding domain (RBD), and a C-terminal SH3 domain. This protein binds sorting nexin 2 (SNX2), tankyrase (TNKS), and dynamin; an interaction between this protein and formin has not been demonstrated yet in human. | NA |
| COL1A2 | 1278 | collagen type I alpha 2 chain | ENSG00000164692 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | NA |
| MTCO1P12 | ENSG00000237973 | MT-CO1 pseudogene 12 | ENSG00000237973 | NA | NA |
| HBA1 | 3039 | hemoglobin subunit alpha 1 | ENSG00000206172 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | NA |
| TPM1 | 7168 | tropomyosin 1 (alpha) | ENSG00000140416 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | NA |
| ZCCHC24 | 219654 | zinc finger CCHC-type containing 24 | ENSG00000165424 | NA | NA |
| ZEB2 | 9839 | zinc finger E-box binding homeobox 2 | ENSG00000169554 | The protein encoded by this gene is a member of the Zfh1 family of 2-handed zinc finger/homeodomain proteins. It is located in the nucleus and functions as a DNA-binding transcriptional repressor that interacts with activated SMADs. Mutations in this gene are associated with Hirschsprung disease/Mowat-Wilson syndrome. Alternatively spliced transcript variants have been found for this gene. | NA |
| C1orf198 | 84886 | chromosome 1 open reading frame 198 | ENSG00000119280 | NA | NA |
| SCARB1 | 949 | scavenger receptor class B member 1 | ENSG00000073060 | The protein encoded by this gene is a plasma membrane receptor for high density lipoprotein cholesterol (HDL). The encoded protein mediates cholesterol transfer to and from HDL. In addition, this protein is a receptor for hepatitis C virus glycoprotein E2. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| NBEAL2 | 23218 | neurobeachin like 2 | ENSG00000160796 | The protein encoded by this gene contains a beige and Chediak-Higashi (BEACH) domain and multiple WD40 domains, and may play a role in megakaryocyte alpha-granule biogenesis. Mutations in this gene are a cause of gray platelet syndrome. | NA |
| NA | NA | NA | ENSG00000117289 | NA | TRUE |
| NBEA | 26960 | neurobeachin | ENSG00000172915 | This gene encodes a member of a large, diverse group of A-kinase anchor proteins that target the activity of protein kinase A to specific subcellular sites by binding to its type II regulatory subunits. Brain-specific expression and coat protein-like membrane recruitment of a highly similar protein in mouse suggest an involvement in neuronal post-Golgi membrane traffic. Mutations in this gene may be associated with a form of autism. This gene and its expression are frequently disrupted in patients with multiple myeloma. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Additional transcript variants may exist, but their full-length nature has not been determined. | NA |
| CRYAB | 1410 | crystallin alpha B | ENSG00000109846 | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | NA |
| PEA15 | 8682 | phosphoprotein enriched in astrocytes 15 | ENSG00000162734 | This gene encodes a death effector domain-containing protein that functions as a negative regulator of apoptosis. The encoded protein is an endogenous substrate for protein kinase C. This protein is also overexpressed in type 2 diabetes mellitus, where it may contribute to insulin resistance in glucose uptake. Alternative splicing results in multiple transcript variants. | NA |
| CCNI | 10983 | cyclin I | ENSG00000118816 | The protein encoded by this gene belongs to the highly conserved cyclin family, whose members are characterized by a dramatic periodicity in protein abundance through the cell cycle. Cyclins function as regulators of CDK kinases. Different cyclins exhibit distinct expression and degradation patterns which contribute to the temporal coordination of each mitotic event. This cyclin shows the highest similarity with cyclin G. The transcript of this gene was found to be expressed constantly during cell cycle progression. The function of this cyclin has not yet been determined. | NA |
| TSC22D4 | 81628 | TSC22 domain family member 4 | ENSG00000166925 | TSC22D4 is a member of the TSC22 domain family of leucine zipper transcriptional regulators (see TSC22D3; MIM 300506) (Kester et al., 1999 [PubMed 10488076]; Fiorenza et al., 2001 [PubMed 11707329]). | NA |
| HIPK2 | 28996 | homeodomain interacting protein kinase 2 | ENSG00000064393 | This gene encodes a conserved serine/threonine kinase that is a member of the homeodomain-interacting protein kinase family. The encoded protein interacts with homeodomain transcription factors and many other transcription factors such as p53, and can function as both a corepressor and a coactivator depending on the transcription factor and its subcellular localization. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| BSG | 682 | basigin (Ok blood group) | ENSG00000172270 | The protein encoded by this gene is a plasma membrane protein that is important in spermatogenesis, embryo implantation, neural network formation, and tumor progression. The encoded protein is also a member of the immunoglobulin superfamily. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| DCN | 1634 | decorin | ENSG00000011465 | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | NA |
| ORM1 | 5004 | orosomucoid 1 | ENSG00000229314 | This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | NA |
| KRT10 | 3858 | keratin 10 | ENSG00000186395 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | NA |
| LRG1 | 116844 | leucine rich alpha-2-glycoprotein 1 | ENSG00000171236 | The leucine-rich repeat (LRR) family of proteins, including LRG1, have been shown to be involved in protein-protein interaction, signal transduction, and cell adhesion and development. LRG1 is expressed during granulocyte differentiation (O’Donnell et al., 2002 [PubMed 12223515]). | NA |
| AEBP1 | 165 | AE binding protein 1 | ENSG00000106624 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | NA |
| CASQ2 | 845 | calsequestrin 2 | ENSG00000118729 | The protein encoded by this gene specifies the cardiac muscle family member of the calsequestrin family. Calsequestrin is localized to the sarcoplasmic reticulum in cardiac and slow skeletal muscle cells. The protein is a calcium binding protein that stores calcium for muscle function. Mutations in this gene cause stress-induced polymorphic ventricular tachycardia, also referred to as catecholaminergic polymorphic ventricular tachycardia 2 (CPVT2), a disease characterized by bidirectional ventricular tachycardia that may lead to cardiac arrest. | NA |
| PYGB | 5834 | phosphorylase, glycogen; brain | ENSG00000100994 | The protein encoded by this gene is a glycogen phosphorylase found predominantly in the brain. The encoded protein forms homodimers which can associate into homotetramers, the enzymatically active form of glycogen phosphorylase. The activity of this enzyme is positively regulated by AMP and negatively regulated by ATP, ADP, and glucose-6-phosphate. This enzyme catalyzes the rate-determining step in glycogen degradation. | NA |
| MYOM2 | 9172 | myomesin 2 | ENSG00000036448 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD and 165 kD. The predicted MYOM2 protein contains 1,465 amino acids. Like MYOM1, MYOM2 has a unique N-terminal domain followed by 12 repeat domains with strong homology to either fibronectin type III or immunoglobulin C2 domains. Protein sequence comparisons suggested that the MYOM2 protein and bovine M protein are identical. | NA |
| TCAP | 8557 | titin-cap | ENSG00000173991 | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | NA |
| GAPDH | 2597 | glyceraldehyde-3-phosphate dehydrogenase | ENSG00000111640 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | NA |
| MYH1 | 4619 | myosin, heavy chain 1, skeletal muscle, adult | ENSG00000109061 | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | NA |
| AOC3 | 8639 | amine oxidase, copper containing 3 | ENSG00000131471 | This gene encodes a member of the semicarbazide-sensitive amine oxidase family. Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes in the presence of copper and quinone cofactor. The encoded protein is localized to the cell surface, has adhesive properties as well as monoamine oxidase activity, and may be involved in leukocyte trafficking. Alterations in levels of the encoded protein may be associated with many diseases, including diabetes mellitus. A pseudogene of this gene has been described and is located approximately 9-kb downstream on the same chromosome. Alternative splicing results in multiple transcript variants. | NA |
| MAPK8IP1 | 9479 | mitogen-activated protein kinase 8 interacting protein 1 | ENSG00000121653 | This gene encodes a regulator of the pancreatic beta-cell function. It is highly similar to JIP-1, a mouse protein known to be a regulator of c-Jun amino-terminal kinase (Mapk8). This protein has been shown to prevent MAPK8 mediated activation of transcription factors, and to decrease IL-1 beta and MAP kinase kinase 1 (MEKK1) induced apoptosis in pancreatic beta cells. This protein also functions as a DNA-binding transactivator of the glucose transporter GLUT2. RE1-silencing transcription factor (REST) is reported to repress the expression of this gene in insulin-secreting beta cells. This gene is found to be mutated in a type 2 diabetes family, and thus is thought to be a susceptibility gene for type 2 diabetes. | NA |
| EZR | 7430 | ezrin | ENSG00000092820 | The cytoplasmic peripheral membrane protein encoded by this gene functions as a protein-tyrosine kinase substrate in microvilli. As a member of the ERM protein family, this protein serves as an intermediate between the plasma membrane and the actin cytoskeleton. This protein plays a key role in cell surface structure adhesion, migration and organization, and it has been implicated in various human cancers. A pseudogene located on chromosome 3 has been identified for this gene. Alternatively spliced variants have also been described for this gene. | NA |
| F3 | 2152 | coagulation factor III, tissue factor | ENSG00000117525 | This gene encodes coagulation factor III which is a cell surface glycoprotein. This factor enables cells to initiate the blood coagulation cascades, and it functions as the high-affinity receptor for the coagulation factor VII. The resulting complex provides a catalytic event that is responsible for initiation of the coagulation protease cascades by specific limited proteolysis. Unlike the other cofactors of these protease cascades, which circulate as nonfunctional precursors, this factor is a potent initiator that is fully functional when expressed on cell surfaces. There are 3 distinct domains of this factor: extracellular, transmembrane, and cytoplasmic. This protein is the only one in the coagulation pathway for which a congenital deficiency has not been described. Alternate splicing results in multiple transcript variants. | NA |
| PALLD | 23022 | palladin, cytoskeletal associated protein | ENSG00000129116 | This gene encodes a cytoskeletal protein that is required for organizing the actin cytoskeleton. The protein is a component of actin-containing microfilaments, and it is involved in the control of cell shape, adhesion, and contraction. Polymorphisms in this gene are associated with a susceptibility to pancreatic cancer type 1, and also with a risk for myocardial infarction. Alternative splicing results in multiple transcript variants. | NA |
| MB | 4151 | myoglobin | ENSG00000198125 | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | NA |
| ITGB4 | 3691 | integrin subunit beta 4 | ENSG00000132470 | Integrins are heterodimers comprised of alpha and beta subunits, that are noncovalently associated transmembrane glycoprotein receptors. Different combinations of alpha and beta polypeptides form complexes that vary in their ligand-binding specificities. Integrins mediate cell-matrix or cell-cell adhesion, and transduce signals that regulate gene expression and cell growth. This gene encodes the integrin beta 4 subunit, a receptor for the laminins. This subunit tends to associate with alpha 6 subunit and is likely to play a pivotal role in the biology of invasive carcinoma. Mutations in this gene are associated with epidermolysis bullosa with pyloric atresia. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | NA |
| REG1A | 5967 | regenerating family member 1 alpha | ENSG00000115386 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| SYNPO | 11346 | synaptopodin | ENSG00000171992 | Synaptopodin is an actin-associated protein that may play a role in actin-based cell shape and motility. The name synaptopodin derives from the protein’s associations with postsynaptic densities and dendritic spines and with renal podocytes (Mundel et al., 1997 [PubMed 9314539]). | NA |
| HSPB7 | 27129 | heat shock protein family B (small) member 7 | ENSG00000173641 | NA | NA |
| RTKN | 6242 | rhotekin | ENSG00000114993 | This gene encodes a scaffold protein that interacts with GTP-bound Rho proteins. Binding of this protein inhibits the GTPase activity of Rho proteins. This protein may interfere with the conversion of active, GTP-bound Rho to the inactive GDP-bound form by RhoGAP. Rho proteins regulate many important cellular processes, including cytokinesis, transcription, smooth muscle contraction, cell growth and transformation. Dysregulation of the Rho signal transduction pathway has been implicated in many forms of cancer. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| MYL3 | 4634 | myosin light chain 3 | ENSG00000160808 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. | NA |
| PAQR6 | 79957 | progestin and adipoQ receptor family member 6 | ENSG00000160781 | NA | NA |
| FGA | 2243 | fibrinogen alpha chain | ENSG00000171560 | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | NA |
| HRC | 3270 | histidine rich calcium binding protein | ENSG00000130528 | This gene encodes a luminal sarcoplasmic reticulum protein identified by its ability to bind low-density lipoprotein with high affinity. The protein interacts with the cytoplasmic domain of triadin, the main transmembrane protein of the junctional sarcoplasmic reticulum (SR) of skeletal muscle. The protein functions in the regulation of releasable calcium into the SR. | NA |
| RPL37A | 6168 | ribosomal protein L37a | ENSG00000197756 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L37AE family of ribosomal proteins. It is located in the cytoplasm. The protein contains a C4-type zinc finger-like domain. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | NA |
| CMYA5 | 202333 | cardiomyopathy associated 5 | ENSG00000164309 | NA | NA |
| NRAP | 4892 | nebulin related anchoring protein | ENSG00000197893 | NA | NA |
| ANKRD1 | 27063 | ankyrin repeat domain 1 | ENSG00000148677 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | NA |
| A2M | 2 | alpha-2-macroglobulin | ENSG00000175899 | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | NA |
| KIAA0930 | 23313 | KIAA0930 | ENSG00000100364 | NA | NA |
| IGFBP2 | 3485 | insulin like growth factor binding protein 2 | ENSG00000115457 | The protein encoded by this gene is one of six similar proteins that bind insulin-like growth factors I and II (IGF-I and IGF-II). The encoded protein can be secreted into the bloodstream, where it binds IGF-I and IGF-II with high affinity, or it can remain intracellular, interacting with many different ligands. High expression levels of this protein promote the growth of several types of tumors and may be predictive of the chances of recovery of the patient. Several transcript variants, one encoding a secreted isoform and the others encoding nonsecreted isoforms, have been found for this gene. | NA |
| CD36 | 948 | CD36 molecule | ENSG00000135218 | The protein encoded by this gene is the fourth major glycoprotein of the platelet surface and serves as a receptor for thrombospondin in platelets and various cell lines. Since thrombospondins are widely distributed proteins involved in a variety of adhesive processes, this protein may have important functions as a cell adhesion molecule. It binds to collagen, thrombospondin, anionic phospholipids and oxidized LDL. It directly mediates cytoadherence of Plasmodium falciparum parasitized erythrocytes and it binds long chain fatty acids and may function in the transport and/or as a regulator of fatty acid transport. Mutations in this gene cause platelet glycoprotein deficiency. Multiple alternatively spliced transcript variants have been found for this gene. | NA |
| NA | NA | NA | ENSG00000140181 | NA | TRUE |
| CNP | 1267 | 2’,3’-cyclic nucleotide 3’ phosphodiesterase | ENSG00000173786 | NA | NA |
| NA | NA | NA | ENSG00000256545 | NA | TRUE |
| REG1B | 5968 | regenerating family member 1 beta | ENSG00000172023 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
# Save the Ensembl IDs queried for cluster 3, one ID per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 3, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
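The cluster 3 table contains Ensembl IDs that MyGene.info could not match (rows with `notfound` equal to `TRUE`, such as ENSG00000140181 and ENSG00000256545). The following is a minimal sketch, not part of the original analysis, of how such rows could be separated out before the ID list is saved. It assumes the result has `query` and `notfound` columns as shown in the table; the toy data frame `annot`, the variable names, and the temporary output file are hypothetical stand-ins for `as.data.frame(out)` and the real output path.

```r
# Toy stand-in for as.data.frame(out); the real object comes from mygene::queryMany().
# The four rows reuse IDs from the cluster 3 table above.
annot <- data.frame(
  symbol   = c("CD36", NA, "CNP", NA),
  query    = c("ENSG00000135218", "ENSG00000140181",
               "ENSG00000173786", "ENSG00000256545"),
  notfound = c(NA, TRUE, NA, TRUE),
  stringsAsFactors = FALSE
)

# Failed lookups carry notfound == TRUE; successful ones have NA in that column.
is_missing    <- !is.na(annot$notfound) & annot$notfound
unmatched_ids <- annot$query[is_missing]
matched       <- annot[!is_missing, , drop = FALSE]

# Write only the matched IDs, one per line (hypothetical temporary location).
out_file <- tempfile(fileext = ".txt")
write.table(matched$query, out_file,
            col.names = FALSE, row.names = FALSE, quote = FALSE)

# Keep a record of what was dropped.
message("Unmatched IDs: ", paste(unmatched_ids, collapse = ", "))
```

Whether unmatched IDs should be dropped depends on the downstream tool: the original call writes every queried ID, which is the right choice if that tool resolves Ensembl IDs itself.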
# Annotate the top genes of cluster 4 (row 4 of gene_list) via MyGene.info.
out <- mygene::queryMany(gene_list[4,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | summary | query | name |
|---|---|---|---|---|
| SAA1 | 6288 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | ENSG00000173432 | serum amyloid A1 |
| MYH11 | 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000133392 | myosin, heavy chain 11, smooth muscle |
| ACTG1 | 71 | Actins are highly conserved proteins that are involved in various types of cell motility, and maintenance of the cytoskeleton. In vertebrates, three main groups of actin isoforms, alpha, beta and gamma have been identified. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton, and as mediators of internal cell motility. Actin, gamma 1, encoded by this gene, is a cytoplasmic actin found in non-muscle cells. Mutations in this gene are associated with DFNA20/26, a subtype of autosomal dominant non-syndromic sensorineural progressive hearing loss. Alternative splicing results in multiple transcript variants. | ENSG00000184009 | actin gamma 1 |
| KRT10 | 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | ENSG00000186395 | keratin 10 |
| FABP4 | 2167 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs roles include fatty acid uptake, transport, and metabolism. | ENSG00000170323 | fatty acid binding protein 4 |
| CD36 | 948 | The protein encoded by this gene is the fourth major glycoprotein of the platelet surface and serves as a receptor for thrombospondin in platelets and various cell lines. Since thrombospondins are widely distributed proteins involved in a variety of adhesive processes, this protein may have important functions as a cell adhesion molecule. It binds to collagen, thrombospondin, anionic phospholipids and oxidized LDL. It directly mediates cytoadherence of Plasmodium falciparum parasitized erythrocytes and it binds long chain fatty acids and may function in the transport and/or as a regulator of fatty acid transport. Mutations in this gene cause platelet glycoprotein deficiency. Multiple alternatively spliced transcript variants have been found for this gene. | ENSG00000135218 | CD36 molecule |
| PKM | 5315 | This gene encodes a protein involved in glycolysis. The encoded protein is a pyruvate kinase that catalyzes the transfer of a phosphoryl group from phosphoenolpyruvate to ADP, generating ATP and pyruvate. This protein has been shown to interact with thyroid hormone and may mediate cellular metabolic effects induced by thyroid hormones. This protein has been found to bind Opa protein, a bacterial outer membrane protein involved in gonococcal adherence to and invasion of human cells, suggesting a role of this protein in bacterial pathogenesis. Several alternatively spliced transcript variants encoding a few distinct isoforms have been reported. | ENSG00000067225 | pyruvate kinase, muscle |
| ACSL1 | 2180 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000151726 | acyl-CoA synthetase long-chain family member 1 |
| GAPDH | 2597 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | ENSG00000111640 | glyceraldehyde-3-phosphate dehydrogenase |
| RBP4 | 5950 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | ENSG00000138207 | retinol binding protein 4 |
| THBS1 | 7057 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | ENSG00000137801 | thrombospondin 1 |
| ACTN4 | 81 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, alpha actinin isoform which is concentrated in the cytoplasm, and thought to be involved in metastatic processes. Mutations in this gene have been associated with focal and segmental glomerulosclerosis. | ENSG00000130402 | actinin alpha 4 |
| PLIN2 | 123 | The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | ENSG00000147872 | perilipin 2 |
| ADH1B | 125 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000196616 | alcohol dehydrogenase 1B (class I), beta polypeptide |
| PYGM | 5837 | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | ENSG00000068976 | phosphorylase, glycogen, muscle |
| ALDOA | 226 | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. | ENSG00000149925 | aldolase, fructose-bisphosphate A |
| REG1A | 5967 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | ENSG00000115386 | regenerating family member 1 alpha |
| ACTB | 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ENSG00000075624 | actin, beta |
| MYH9 | 4627 | This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | ENSG00000100345 | myosin, heavy chain 9, non-muscle |
| GP2 | 2813 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | ENSG00000169347 | glycoprotein 2 |
| ACTA2 | 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000107796 | actin, alpha 2, smooth muscle, aorta |
| GRINA | 2907 | NA | ENSG00000178719 | glutamate ionotropic receptor NMDA type subunit associated protein 1 |
| LPL | 4023 | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | ENSG00000175445 | lipoprotein lipase |
| S100A9 | 6280 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | ENSG00000163220 | S100 calcium binding protein A9 |
| ECHDC2 | 55268 | NA | ENSG00000121310 | enoyl-CoA hydratase domain containing 2 |
| HBA2 | 3040 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | ENSG00000188536 | hemoglobin subunit alpha 2 |
| MYH2 | 4620 | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000125414 | myosin, heavy chain 2, skeletal muscle, adult |
| C1S | 716 | This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | ENSG00000182326 | complement component 1, s subcomponent |
| MYBPC1 | 4604 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000196091 | myosin binding protein C, slow type |
| CPA1 | 1357 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | ENSG00000091704 | carboxypeptidase A1 |
| ACADVL | 37 | The protein encoded by this gene is targeted to the inner mitochondrial membrane where it catalyzes the first step of the mitochondrial fatty acid beta-oxidation pathway. This acyl-Coenzyme A dehydrogenase is specific to long-chain and very-long-chain fatty acids. A deficiency in this gene product reduces myocardial fatty acid beta-oxidation and is associated with cardiomyopathy. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000072778 | acyl-CoA dehydrogenase, very long chain |
| GLUL | 2752 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | ENSG00000135821 | glutamate-ammonia ligase |
| PRSS1 | 5644 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | ENSG00000204983 | protease, serine 1 |
| CCNL2 | 81669 | The protein encoded by this gene belongs to the cyclin family. Through its interaction with several proteins, such as RNA polymerase II, splicing factors, and cyclin-dependent kinases, this protein functions as a regulator of the pre-mRNA splicing process, as well as in inducing apoptosis by modulating the expression of apoptotic and antiapoptotic proteins. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | ENSG00000221978 | cyclin L2 |
| STAB1 | 23166 | This gene encodes a large, transmembrane receptor protein which may function in angiogenesis, lymphocyte homing, cell adhesion, or receptor scavenging. The protein contains 7 fasciclin, 16 epidermal growth factor (EGF)-like, and 2 laminin-type EGF-like domains as well as a C-type lectin-like hyaluronan-binding Link module. The protein is primarily expressed on sinusoidal endothelial cells of liver, spleen, and lymph node. The receptor has been shown to endocytose ligands such as low density lipoprotein, Gram-positive and Gram-negative bacteria, and advanced glycosylation end products. Supporting its possible role as a scavenger receptor, the protein rapidly cycles between the plasma membrane and early endosomes. | ENSG00000010327 | stabilin 1 |
| SERPINF1 | 5176 | The protein encoded by this gene is a member of the serpin family, although it does not display the serine protease inhibitory activity shown by many of the other serpin family members. The encoded protein is secreted and strongly inhibits angiogenesis. In addition, this protein is a neurotrophic factor involved in neuronal differentiation in retinoblastoma cells. | ENSG00000132386 | serpin family F member 1 |
| MYH1 | 4619 | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | ENSG00000109061 | myosin, heavy chain 1, skeletal muscle, adult |
| MYLK | 4638 | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | ENSG00000065534 | myosin light chain kinase |
| CEL | 1056 | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | ENSG00000170835 | carboxyl ester lipase |
| SAA2 | 6289 | NA | ENSG00000134339 | serum amyloid A2 |
| PLA2G2A | 5320 | The protein encoded by this gene is a member of the phospholipase A2 family (PLA2). PLA2s constitute a diverse family of enzymes with respect to sequence, function, localization, and divalent cation requirements. This gene product belongs to group II, which contains secreted form of PLA2, an extracellular enzyme that has a low molecular mass and requires calcium ions for catalysis. It catalyzes the hydrolysis of the sn-2 fatty acid acyl ester bond of phosphoglycerides, releasing free fatty acids and lysophospholipids, and thought to participate in the regulation of the phospholipid metabolism in biomembranes. Several alternatively spliced transcript variants with different 5’ UTRs have been found for this gene. | ENSG00000188257 | phospholipase A2 group IIA |
| APMAP | 57136 | NA | ENSG00000101474 | adipocyte plasma membrane associated protein |
| PLN | 5350 | The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | ENSG00000198523 | phospholamban |
| CPB1 | 1360 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | ENSG00000153002 | carboxypeptidase B1 |
| COL6A1 | 1291 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | ENSG00000142156 | collagen type VI alpha 1 |
| OAF | 220323 | NA | ENSG00000184232 | out at first homolog |
| TTN | 7273 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | ENSG00000155657 | titin |
| KRT1 | 3848 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000167768 | keratin 1 |
| ACTN1 | 87 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000072110 | actinin alpha 1 |
| DGAT2 | 84649 | This gene encodes one of two enzymes which catalyzes the final reaction in the synthesis of triglycerides in which diacylglycerol is covalently bound to long chain fatty acyl-CoAs. The encoded protein catalyzes this reaction at low concentrations of magnesium chloride while the other enzyme has high activity at high concentrations of magnesium chloride. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000062282 | diacylglycerol O-acyltransferase 2 |
| IQGAP1 | 8826 | This gene encodes a member of the IQGAP family. The protein contains four IQ domains, one calponin homology domain, one Ras-GAP domain and one WW domain. It interacts with components of the cytoskeleton, with cell adhesion molecules, and with several signaling molecules to regulate cell morphology and motility. Expression of the protein is upregulated by gene amplification in two gastric cancer cell lines. | ENSG00000140575 | IQ motif containing GTPase activating protein 1 |
| MEF2C | 4208 | This locus encodes a member of the MADS box transcription enhancer factor 2 (MEF2) family of proteins, which play a role in myogenesis. The encoded protein, MEF2 polypeptide C, has both trans-activating and DNA binding activities. This protein may play a role in maintaining the differentiated state of muscle cells. Mutations and deletions at this locus have been associated with severe mental retardation, stereotypic movements, epilepsy, and cerebral malformation. Alternatively spliced transcript variants have been described. | ENSG00000081189 | myocyte enhancer factor 2C |
| NNMT | 4837 | N-methylation is one method by which drug and other xenobiotic compounds are metabolized by the liver. This gene encodes the protein responsible for this enzymatic activity which uses S-adenosyl methionine as the methyl donor. | ENSG00000166741 | nicotinamide N-methyltransferase |
| RGS5 | 8490 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | ENSG00000143248 | regulator of G-protein signaling 5 |
| CELA3A | 10136 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | ENSG00000142789 | chymotrypsin like elastase family member 3A |
| SPINT1 | 6692 | The protein encoded by this gene is a member of the Kunitz family of serine protease inhibitors. The protein is a potent inhibitor specific for HGF activator and is thought to be involved in the regulation of the proteolytic activation of HGF in injured tissues. Alternative splicing results in multiple variants encoding different isoforms. | ENSG00000166145 | serine peptidase inhibitor, Kunitz type 1 |
| LYZ | 4069 | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | ENSG00000090382 | lysozyme |
| SYNM | 23336 | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | ENSG00000182253 | synemin |
| NEB | 4703 | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | ENSG00000183091 | nebulin |
| MYBPC3 | 4607 | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | ENSG00000134571 | myosin binding protein C, cardiac |
| CNN1 | 1264 | NA | ENSG00000130176 | calponin 1 |
| HYAL1 | 3373 | This gene encodes a lysosomal hyaluronidase. Hyaluronidases intracellularly degrade hyaluronan, one of the major glycosaminoglycans of the extracellular matrix. Hyaluronan is thought to be involved in cell proliferation, migration and differentiation. This enzyme is active at an acidic pH and is the major hyaluronidase in plasma. Mutations in this gene are associated with mucopolysaccharidosis type IX, or hyaluronidase deficiency. The gene is one of several related genes in a region of chromosome 3p21.3 associated with tumor suppression. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000114378 | hyaluronoglucosaminidase 1 |
| CALM3 | 808 | NA | ENSG00000160014 | calmodulin 3 (phosphorylase kinase, delta) |
| CALM2 | 805 | This gene is a member of the calmodulin gene family. There are three distinct calmodulin genes dispersed throughout the genome that encode the identical protein, but differ at the nucleotide level. Calmodulin is a calcium binding protein that plays a role in signaling pathways, cell cycle progression and proliferation. Several infants with severe forms of long-QT syndrome (LQTS) who displayed life-threatening ventricular arrhythmias together with delayed neurodevelopment and epilepsy were found to have mutations in either this gene or another member of the calmodulin gene family (PMID:23388215). Mutations in this gene have also been identified in patients with less severe forms of LQTS (PMID:24917665), while mutations in another calmodulin gene family member have been associated with catecholaminergic polymorphic ventricular tachycardia (CPVT)(PMID:23040497), a rare disorder thought to be the cause of a significant fraction of sudden cardiac deaths in young individuals. Pseudogenes of this gene are found on chromosomes 10, 13, and 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000160014 | calmodulin 2 (phosphorylase kinase, delta) |
| REG1B | 5968 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | ENSG00000172023 | regenerating family member 1 beta |
| G0S2 | 50486 | NA | ENSG00000123689 | G0/G1 switch 2 |
| C12orf75 | 387882 | NA | ENSG00000235162 | chromosome 12 open reading frame 75 |
| TNNT2 | 7139 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | ENSG00000118194 | troponin T2, cardiac type |
| FBN1 | 2200 | This gene encodes a member of the fibrillin family of proteins. The encoded preproprotein is proteolytically processed to generate two proteins including the extracellular matrix component fibrillin-1 and the protein hormone asprosin. Fibrillin-1 is an extracellular matrix glycoprotein that serves as a structural component of calcium-binding microfibrils. These microfibrils provide force-bearing structural support in elastic and nonelastic connective tissue throughout the body. Asprosin, secreted by white adipose tissue, has been shown to regulate glucose homeostasis. Mutations in this gene are associated with Marfan syndrome and the related MASS phenotype, as well as ectopia lentis syndrome, Weill-Marchesani syndrome, Shprintzen-Goldberg syndrome and neonatal progeroid syndrome. | ENSG00000166147 | fibrillin 1 |
| AHNAK | 79026 | NA | ENSG00000124942 | AHNAK nucleoprotein |
| KRT2 | 3849 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000172867 | keratin 2 |
| MYOZ1 | 58529 | The protein encoded by this gene is primarily expressed in the skeletal muscle, and belongs to the myozenin family. Members of this family function as calcineurin-interacting proteins that help tether calcineurin to the sarcomere of cardiac and skeletal muscle. They play an important role in modulation of calcineurin signaling. | ENSG00000177791 | myozenin 1 |
| SAA2-SAA4 | 100528017 | This locus represents naturally occurring read-through transcription between the neighboring serum amyloid A2 and serum amyloid A4 genes on chromosome 11. The read-through transcript produces a fusion protein that shares sequence identity with each individual gene product. | ENSG00000255071 | SAA2-SAA4 readthrough |
| RASD1 | 51655 | This gene encodes a member of the Ras superfamily of small GTPases and is induced by dexamethasone. The encoded protein is an activator of G-protein signaling and acts as a direct nucleotide exchange factor for Gi-Go proteins. This protein interacts with the neuronal nitric oxide adaptor protein CAPON, and a nuclear adaptor protein FE65, which interacts with the Alzheimer’s disease amyloid precursor protein. This gene may play a role in dexamethasone-induced alterations in cell morphology, growth and cell-extracellular matrix interactions. Epigenetic inactivation of this gene is closely correlated with resistance to dexamethasone in multiple myeloma cells. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000108551 | ras related dexamethasone induced 1 |
| ENO1 | 2023 | This gene encodes alpha-enolase, one of three enolase isoenzymes found in mammals. Each isoenzyme is a homodimer composed of 2 alpha, 2 gamma, or 2 beta subunits, and functions as a glycolytic enzyme. Alpha-enolase in addition, functions as a structural lens protein (tau-crystallin) in the monomeric form. Alternative splicing of this gene results in a shorter isoform that has been shown to bind to the c-myc promoter and function as a tumor suppressor. Several pseudogenes have been identified, including one on the long arm of chromosome 1. Alpha-enolase has also been identified as an autoantigen in Hashimoto encephalopathy. | ENSG00000074800 | enolase 1 |
| ANXA1 | 301 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | ENSG00000135046 | annexin A1 |
| YWHAZ | 7534 | This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 99% identical to the mouse, rat and sheep orthologs. The encoded protein interacts with IRS1 protein, suggesting a role in regulating insulin sensitivity. Several transcript variants that differ in the 5’ UTR but that encode the same protein have been identified for this gene. | ENSG00000164924 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein zeta |
| TG | 7038 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | ENSG00000042832 | thyroglobulin |
| DAB2 | 1601 | This gene encodes a mitogen-responsive phosphoprotein. It is expressed in normal ovarian epithelial cells, but is down-regulated or absent from ovarian carcinoma cell lines, suggesting its role as a tumor suppressor. This protein binds to the SH3 domains of GRB2, an adaptor protein that couples tyrosine kinase receptors to SOS (a guanine nucleotide exchange factor for Ras), via its C-terminal proline-rich sequences, and may thus modulate growth factor/Ras pathways by competing with SOS for binding to GRB2. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000153071 | DAB2, clathrin adaptor protein |
| CSDE1 | 7812 | NA | ENSG00000009307 | cold shock domain containing E1 |
| TTC9 | 23508 | This gene encodes a protein that contains three tetratricopeptide repeats. The gene has been shown to be hormonally regulated in breast cancer cells and may play a role in cancer cell invasion and metastasis. | ENSG00000133985 | tetratricopeptide repeat domain 9 |
| ATP2A1 | 487 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol to the sarcoplasmic reticulum lumen, and is involved in muscular excitation and contraction. Mutations in this gene cause some autosomal recessive forms of Brody disease, characterized by increasing impairment of muscular relaxation during exercise. Alternative splicing results in three transcript variants encoding different isoforms. | ENSG00000196296 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 1 |
| TLE2 | 7089 | NA | ENSG00000065717 | transducin like enhancer of split 2 |
| HBA1 | 3039 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | ENSG00000206172 | hemoglobin subunit alpha 1 |
| ENG | 2022 | This gene encodes a homodimeric transmembrane protein which is a major glycoprotein of the vascular endothelium. This protein is a component of the transforming growth factor beta receptor complex and it binds to the beta1 and beta3 peptides with high affinity. Mutations in this gene cause hereditary hemorrhagic telangiectasia, also known as Osler-Rendu-Weber syndrome 1, an autosomal dominant multisystemic vascular dysplasia. This gene may also be involved in preeclampsia and several types of cancer. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000106991 | endoglin |
| GSTP1 | 2950 | Glutathione S-transferases (GSTs) are a family of enzymes that play an important role in detoxification by catalyzing the conjugation of many hydrophobic and electrophilic compounds with reduced glutathione. Based on their biochemical, immunologic, and structural properties, the soluble GSTs are categorized into 4 main classes: alpha, mu, pi, and theta. This GST family member is a polymorphic gene encoding active, functionally different GSTP1 variant proteins that are thought to function in xenobiotic metabolism and play a role in susceptibility to cancer, and other diseases. | ENSG00000084207 | glutathione S-transferase pi 1 |
| ENTPD1 | 953 | The protein encoded by this gene is a plasma membrane protein that hydrolyzes extracellular ATP and ADP to AMP. Inhibition of this protein’s activity may confer anticancer benefits. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000138185 | ectonucleoside triphosphate diphosphohydrolase 1 |
| STAG3L5P-PVRIG2P-PILRB | 101752399 | This locus represents naturally occurring readthrough transcription among the neighboring LOC101735302 (stromal antigen 3 pseudogene), LOC101752334 (poliovirus receptor related immunoglobulin domain containing pseudogene) and PILRB (paired immunoglobin-like type 2 receptor beta) genes on chromosome 7. The readthrough transcript is a candidate for nonsense-mediated mRNA decay (NMD), and is unlikely to produce a protein product. | ENSG00000272752 | STAG3L5P-PVRIG2P-PILRB readthrough |
| GPX3 | 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | ENSG00000211445 | glutathione peroxidase 3 |
| MX1 | 4599 | This gene encodes a guanosine triphosphate (GTP)-metabolizing protein that participates in the cellular antiviral response. The encoded protein is induced by type I and type II interferons and antagonizes the replication process of several different RNA and DNA viruses. There is a related gene located adjacent to this gene on chromosome 21, and there are multiple pseudogenes located in a cluster on chromosome 4. Alternative splicing results in multiple transcript variants. | ENSG00000157601 | MX dynamin like GTPase 1 |
| FURIN | 5045 | This gene encodes a member of the subtilisin-like proprotein convertase family, which includes proteases that process protein and peptide precursors trafficking through regulated or constitutive branches of the secretory pathway. It encodes a type 1 membrane bound protease that is expressed in many tissues, including neuroendocrine, liver, gut, and brain. The encoded protein undergoes an initial autocatalytic processing event in the ER and then sorts to the trans-Golgi network through endosomes where a second autocatalytic event takes place and the catalytic activity is acquired. The product of this gene is one of the seven basic amino acid-specific members which cleave their substrates at single or paired basic residues. Some of its substrates include proparathyroid hormone, transforming growth factor beta 1 precursor, proalbumin, pro-beta-secretase, membrane type-1 matrix metalloproteinase, beta subunit of pro-nerve growth factor and von Willebrand factor. It is also thought to be one of the proteases responsible for the activation of HIV envelope glycoproteins gp160 and gp140 and may play a role in tumor progression. This gene is located in close proximity to family member proprotein convertase subtilisin/kexin type 6 and upstream of the FES oncogene. Alternative splicing results in multiple transcript variants. | ENSG00000140564 | furin, paired basic amino acid cleaving enzyme |
| TNNC2 | 7125 | Troponin (Tn), a key protein complex in the regulation of striated muscle contraction, is composed of 3 subunits. The Tn-I subunit inhibits actomyosin ATPase, the Tn-T subunit binds tropomyosin and Tn-C, while the Tn-C subunit binds calcium and overcomes the inhibitory action of the troponin complex on actin filaments. The protein encoded by this gene is the Tn-C subunit. | ENSG00000101470 | troponin C2, fast skeletal type |
| MYL1 | 4632 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in fast skeletal muscle. Two transcript variants have been identified for this gene. | ENSG00000168530 | myosin light chain 1 |
| PDLIM3 | 27295 | The protein encoded by this gene contains a PDZ domain and a LIM domain, indicating that it may be involved in cytoskeletal assembly. In support of this, the encoded protein has been shown to bind the spectrin-like repeats of alpha-actinin-2 and to colocalize with alpha-actinin-2 at the Z lines of skeletal muscle. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. Aberrant alternative splicing of this gene may play a role in myotonic dystrophy. | ENSG00000154553 | PDZ and LIM domain 3 |
| DCXR | 51181 | The protein encoded by this gene acts as a homotetramer to catalyze diacetyl reductase and L-xylulose reductase reactions. The encoded protein may play a role in the uronate cycle of glucose metabolism and in the cellular osmoregulation in the proximal renal tubules. Defects in this gene are a cause of pentosuria. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000169738 | dicarbonyl/L-xylulose reductase |
| PBXIP1 | 57326 | The protein encoded by this gene interacts with the PBX1 homeodomain protein, inhibiting its transcriptional activation potential by preventing its binding to DNA. The encoded protein, which is primarily cytosolic but can shuttle to the nucleus, also can interact with estrogen receptors alpha and beta and promote the proliferation of breast cancer, brain tumors, and lung cancer. Several transcript variants encoding different isoforms have been found for this gene. More variants exist, but their full-length natures have yet to be determined. | ENSG00000163346 | PBX homeobox interacting protein 1 |
| CEBPA | 1050 | This intronless gene encodes a transcription factor that contains a basic leucine zipper (bZIP) domain and recognizes the CCAAT motif in the promoters of target genes. The encoded protein functions in homodimers and also heterodimers with CCAAT/enhancer-binding proteins beta and gamma. Activity of this protein can modulate the expression of genes involved in cell cycle regulation as well as in body weight homeostasis. Mutation of this gene is associated with acute myeloid leukemia. The use of alternative in-frame non-AUG (GUG) and AUG start codons results in protein isoforms with different lengths. Differential translation initiation is mediated by an out-of-frame, upstream open reading frame which is located between the GUG and the first AUG start codons. | ENSG00000245848 | CCAAT/enhancer binding protein alpha |
| LOC100129518 | 100129518 | NA | ENSG00000112096 | uncharacterized LOC100129518 |
| SOD2 | 6648 | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | ENSG00000112096 | superoxide dismutase 2, mitochondrial |
| CRYAB | 1410 | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | ENSG00000109846 | crystallin alpha B |
| FASN | 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | ENSG00000169710 | fatty acid synthase |
| CXCL2 | 2920 | This antimicrobial gene is part of a chemokine superfamily that encodes secreted proteins involved in immunoregulatory and inflammatory processes. The superfamily is divided into four subfamilies based on the arrangement of the N-terminal cysteine residues of the mature peptide. This chemokine, a member of the CXC subfamily, is expressed at sites of inflammation and may suppress hematopoietic progenitor cell proliferation. | ENSG00000081041 | C-X-C motif chemokine ligand 2 |
# Save the Ensembl IDs returned for cluster 4, one per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 4, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
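The same query-then-write step is repeated for every cluster, so the write half could be wrapped in a small helper. The following is a minimal sketch, not part of the original analysis: `write_cluster_genes`, its arguments, and the de-duplication step are assumptions made for illustration, and the demo writes to a temporary directory rather than `../utilities/`.

```r
# Hypothetical helper (not in the original analysis): write the gene IDs of one
# cluster to "gene_names_clus_<cluster>.txt" in out_dir, one ID per line.
write_cluster_genes <- function(ids, cluster, out_dir) {
  # Assumption: repeated query hits are collapsed; the original code keeps them.
  ids <- unique(as.character(ids))
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  writeLines(ids, path)
  invisible(path)
}

# Toy usage with three Ensembl IDs taken from the cluster 4 table above.
demo_ids <- c("ENSG00000109846", "ENSG00000169710", "ENSG00000081041")
demo_path <- write_cluster_genes(demo_ids, cluster = 4, out_dir = tempdir())
readLines(demo_path)
```

In the analysis itself, one would presumably pass `out$query` and the `../utilities/GTEX2013_sparse_fac_sqrt` directory, calling the helper once per row of `gene_list`.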
# Annotate the top genes of cluster 5 through MyGene.info.
out <- mygene::queryMany(gene_list[5, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | symbol | name | query | notfound |
|---|---|---|---|---|---|
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | 1586 | CYP17A1 | cytochrome P450 family 17 subfamily A member 1 | ENSG00000148795 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and is involved in the conversion of progesterone to cortisol in the adrenal cortex. Mutations in this gene cause congenital adrenal hyperplasia due to 11-beta-hydroxylase deficiency. Transcript variants encoding different isoforms have been noted for this gene. | 1584 | CYP11B1 | cytochrome P450 family 11 subfamily B member 1 | ENSG00000160882 | NA |
| NA | ENSG00000211895 | IGHA1 | immunoglobulin heavy constant alpha 1 | ENSG00000211895 | NA |
| The protein encoded by this gene plays a key role in the acute regulation of steroid hormone synthesis by enhancing the conversion of cholesterol into pregnenolone. This protein permits the cleavage of cholesterol into pregnenolone by mediating the transport of cholesterol from the outer mitochondrial membrane to the inner mitochondrial membrane. Mutations in this gene are a cause of congenital lipoid adrenal hyperplasia (CLAH), also called lipoid CAH. A pseudogene of this gene is located on chromosome 13. | 6770 | STAR | steroidogenic acute regulatory protein | ENSG00000147465 | NA |
| Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | 72 | ACTG2 | actin, gamma 2, smooth muscle, enteric | ENSG00000163017 | NA |
| The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | 348 | APOE | apolipoprotein E | ENSG00000130203 | NA |
| This gene encodes a flavin adenine dinucleotide (FAD)-dependent oxidoreductase which catalyzes the reduction of the delta-24 double bond of sterol intermediates during cholesterol biosynthesis. The protein contains a leader sequence that directs it to the endoplasmic reticulum membrane. Missense mutations in this gene have been associated with desmosterolosis. Also, reduced expression of the gene occurs in the temporal cortex of Alzheimer disease patients and overexpression has been observed in adrenal gland cancer cells. | 1718 | DHCR24 | 24-dehydrocholesterol reductase | ENSG00000116133 | NA |
| Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | 5620 | PRM2 | protamine 2 | ENSG00000122304 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and catalyzes the conversion of cholesterol to pregnenolone, the first and rate-limiting step in the synthesis of the steroid hormones. Two transcript variants encoding different isoforms have been found for this gene. The cellular location of the smaller isoform is unclear since it lacks the mitochondrial-targeting transit peptide. | 1583 | CYP11A1 | cytochrome P450 family 11 subfamily A member 1 | ENSG00000140459 | NA |
| This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | 7168 | TPM1 | tropomyosin 1 (alpha) | ENSG00000140416 | NA |
| The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. | 2355 | FOSL2 | FOS like 2, AP-1 transcription factor subunit | ENSG00000075426 | NA |
| This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 | ACTB | actin, beta | ENSG00000075624 | NA |
| The protein encoded by this gene is a plasma membrane receptor for high density lipoprotein cholesterol (HDL). The encoded protein mediates cholesterol transfer to and from HDL. In addition, this protein is a receptor for hepatitis C virus glycoprotein E2. Two transcript variants encoding different isoforms have been found for this gene. | 949 | SCARB1 | scavenger receptor class B member 1 | ENSG00000073060 | NA |
| The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | 2180 | ACSL1 | acyl-CoA synthetase long-chain family member 1 | ENSG00000151726 | NA |
| This gene represents a ubiquitin gene, ubiquitin C. The encoded protein is a polyubiquitin precursor. Conjugation of ubiquitin monomers or polymers can lead to various effects within a cell, depending on the residues to which ubiquitin is conjugated. Ubiquitination has been associated with protein degradation, DNA repair, cell cycle regulation, kinase modification, endocytosis, and regulation of other cell signaling pathways. | 7316 | UBC | ubiquitin C | ENSG00000150991 | NA |
| The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in brain as well as in other tissues, and as a heterodimer with a similar muscle isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. A pseudogene of this gene has been characterized. | 1152 | CKB | creatine kinase B | ENSG00000166165 | NA |
| This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | 3240 | HP | haptoglobin | ENSG00000257017 | NA |
| The expression of DUSP1 gene is induced in human skin fibroblasts by oxidative/heat stress and growth factors. It specifies a protein with structural features similar to members of the non-receptor-type protein-tyrosine phosphatase family, and which has significant amino-acid sequence similarity to a Tyr/Ser-protein phosphatase encoded by the late gene H1 of vaccinia virus. The bacterially expressed and purified DUSP1 protein has intrinsic phosphatase activity, and specifically inactivates mitogen-activated protein (MAP) kinase in vitro by the concomitant dephosphorylation of both its phosphothreonine and phosphotyrosine residues. Furthermore, it suppresses the activation of MAP kinase by oncogenic ras in extracts of Xenopus oocytes. Thus, DUSP1 may play an important role in the human cellular response to environmental stress as well as in the negative regulation of cellular proliferation. | 1843 | DUSP1 | dual specificity phosphatase 1 | ENSG00000120129 | NA |
| Aldehyde oxidase produces hydrogen peroxide and, under certain conditions, can catalyze the formation of superoxide. Aldehyde oxidase is a candidate gene for amyotrophic lateral sclerosis. | 316 | AOX1 | aldehyde oxidase 1 | ENSG00000138356 | NA |
| NA | 5619 | PRM1 | protamine 1 | ENSG00000175646 | NA |
| The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | 6876 | TAGLN | transgelin | ENSG00000149591 | NA |
| The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | 5493 | PPL | periplakin | ENSG00000118898 | NA |
| This gene encodes a member of the profilin family of small actin-binding proteins. The encoded protein plays an important role in actin dynamics by regulating actin polymerization in response to extracellular signals. Deletion of this gene is associated with Miller-Dieker syndrome, and the encoded protein may also play a role in Huntington disease. Multiple pseudogenes of this gene are located on chromosome 1. | 5216 | PFN1 | profilin 1 | ENSG00000108518 | NA |
| This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein involved in stress responses, hormone responses, cell growth, and differentiation. The encoded protein is necessary for p53-mediated caspase activation and apoptosis. Mutations in this gene are a cause of Charcot-Marie-Tooth disease type 4D, and expression of this gene may be a prognostic indicator for several types of cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 10397 | NDRG1 | N-myc downstream regulated 1 | ENSG00000104419 | NA |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3852 | KRT5 | keratin 5 | ENSG00000186081 | NA |
| Epoxide hydrolase is a critical biotransformation enzyme that converts epoxides from the degradation of aromatic compounds to trans-dihydrodiols which can be conjugated and excreted from the body. Epoxide hydrolase functions in both the activation and detoxification of epoxides. Mutations in this gene cause preeclampsia, epoxide hydrolase deficiency or increased epoxide hydrolase activity. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | 2052 | EPHX1 | epoxide hydrolase 1 | ENSG00000143819 | NA |
| NA | ENSG00000211899 | IGHM | immunoglobulin heavy constant mu | ENSG00000211899 | NA |
| This gene encodes a multidomain serine protease inhibitor that contains 15 potential inhibitory domains. The encoded preproprotein is proteolytically processed to generate multiple protein products, which may exhibit unique activities and specificities. These proteins may play a role in skin and hair morphogenesis, as well as anti-inflammatory and antimicrobial protection of mucous epithelia. Mutations in this gene may result in Netherton syndrome, a disorder characterized by ichthyosis, defective cornification, and atopy. This gene is present in a gene cluster on chromosome 5. Alternative splicing results in multiple transcript variants. | 11005 | SPINK5 | serine peptidase inhibitor, Kazal type 5 | ENSG00000133710 | NA |
| The protein encoded by this gene is a cell membrane protein that may be involved in iron export from duodenal epithelial cells. Defects in this gene are a cause of hemochromatosis type 4 (HFE4). | 30061 | SLC40A1 | solute carrier family 40 member 1 | ENSG00000138449 | NA |
| NA | ENSG00000211890 | IGHA2 | immunoglobulin heavy constant alpha 2 (A2m marker) | ENSG00000211890 | NA |
| The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. In some cases, expression of the FOS gene has also been associated with apoptotic cell death. | 2353 | FOS | Fos proto-oncogene, AP-1 transcription factor subunit | ENSG00000170345 | NA |
| Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | 4625 | MYH7 | myosin, heavy chain 7, cardiac muscle, beta | ENSG00000092054 | NA |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region on chromosome 17q21.2. | 3866 | KRT15 | keratin 15 | ENSG00000171346 | NA |
| Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Many of the effects of laminin are mediated through interactions with cell surface receptors. These receptors include members of the integrin family, as well as non-integrin laminin-binding proteins. This gene encodes a high-affinity, non-integrin family, laminin receptor 1. This receptor has been variously called 67 kD laminin receptor, 37 kD laminin receptor precursor (37LRP) and p40 ribosome-associated protein. The amino acid sequence of laminin receptor 1 is highly conserved through evolution, suggesting a key biological function. It has been observed that the level of the laminin receptor transcript is higher in colon carcinoma tissue and lung cancer cell line than their normal counterparts. Also, there is a correlation between the upregulation of this polypeptide in cancer cells and their invasive and metastatic phenotype. Multiple copies of this gene exist, however, most of them are pseudogenes thought to have arisen from retropositional events. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | 3921 | RPSA | ribosomal protein SA | ENSG00000168028 | NA |
| This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | 5284 | PIGR | polymeric immunoglobulin receptor | ENSG00000162896 | NA |
| This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | DES | desmin | ENSG00000175084 | NA |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | 6279 | S100A8 | S100 calcium binding protein A8 | ENSG00000143546 | NA |
| This gene encodes a glycoprotein involved in the regulation of the complement cascade. Binding of the encoded protein to complement proteins accelerates their decay, thereby disrupting the cascade and preventing damage to host cells. Antigens present on this protein constitute the Cromer blood group system (CROM). Alternative splicing results in multiple transcript variants. The predominant transcript variant encodes a membrane-bound protein, but alternatively spliced transcripts may produce soluble proteins. | 1604 | CD55 | CD55 molecule (Cromer blood group) | ENSG00000196352 | NA |
| The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | 4629 | MYH11 | myosin, heavy chain 11, smooth muscle | ENSG00000133392 | NA |
| This gene encodes a transmembrane protein that contains multiple epidermal growth factor repeats that functions as a regulator of cell growth. The encoded protein is involved in the differentiation of several cell types including adipocytes. This gene is located in a region of chromosome 14 frequently showing uniparental disomy, and is imprinted and expressed from the paternal allele. A single nucleotide variant in this gene is associated with child and adolescent obesity and shows polar overdominance, where heterozygotes carrying an active paternal allele express the phenotype, while mutant homozygotes are normal. | 8788 | DLK1 | delta like non-canonical Notch ligand 1 | ENSG00000185559 | NA |
| The protein encoded by this gene is a glycogen phosphorylase found predominantly in the brain. The encoded protein forms homodimers which can associate into homotetramers, the enzymatically active form of glycogen phosphorylase. The activity of this enzyme is positively regulated by AMP and negatively regulated by ATP, ADP, and glucose-6-phosphate. This enzyme catalyzes the rate-determining step in glycogen degradation. | 5834 | PYGB | phosphorylase, glycogen; brain | ENSG00000100994 | NA |
| This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | 1634 | DCN | decorin | ENSG00000011465 | NA |
| This gene is a member of the TIS11 family of early response genes, which are induced by various agonists such as the phorbol ester TPA and the polypeptide mitogen EGF. This gene is well conserved across species and has a promoter that contains motifs seen in other early-response genes. The encoded protein contains a distinguishing putative zinc finger domain with a repeating cys-his motif. This putative nuclear transcription factor most likely functions in regulating the response to growth factors. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 677 | ZFP36L1 | ZFP36 ring finger protein-like 1 | ENSG00000185650 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and hydroxylates steroids at the 21 position. Its activity is required for the synthesis of steroid hormones including cortisol and aldosterone. Mutations in this gene cause congenital adrenal hyperplasia. A related pseudogene is located near this gene; gene conversion events involving the functional gene and the pseudogene are thought to account for many cases of steroid 21-hydroxylase deficiency. Two transcript variants encoding different isoforms have been found for this gene. | 1589 | CYP21A2 | cytochrome P450 family 21 subfamily A member 2 | ENSG00000231852 | NA |
| Spermatogenesis is a complex process regulated by extracellular and intracellular factors as well as cellular interactions among interstitial cells of the testis, Sertoli cells, and germ cells. This gene is expressed in the testis in Sertoli cells but not germ cells. The protein encoded by this gene contains plant homeodomain (PHD) finger domains, also known as leukemia associated protein (LAP) domains, believed to be involved in transcriptional regulation. The protein, which localizes to the nucleus of transfected cells, has been implicated in the transcriptional regulation of spermatogenesis. Alternate splicing results in multiple transcript variants of this gene. | 51533 | PHF7 | PHD finger protein 7 | ENSG00000010318 | NA |
| NA | 57515 | SERINC1 | serine incorporator 1 | ENSG00000111897 | NA |
| This gene encodes a member of the aldo/keto reductase superfamily, which consists of more than 40 known enzymes and proteins. This member catalyzes the reduction of a number of aldehydes, including the aldehyde form of glucose, and is thereby implicated in the development of diabetic complications by catalyzing the reduction of glucose to sorbitol. Multiple pseudogenes have been identified for this gene. The nomenclature system used by the HUGO Gene Nomenclature Committee to define human aldo-keto reductase family members is known to differ from that used by the Mouse Genome Informatics database. | 231 | AKR1B1 | aldo-keto reductase family 1 member B | ENSG00000085662 | NA |
| NA | 55074 | OXR1 | oxidation resistance 1 | ENSG00000164830 | NA |
| NA | 84669 | USP32 | ubiquitin specific peptidase 32 | ENSG00000170832 | NA |
| This gene encodes a member of the Notch family. Members of this Type 1 transmembrane protein family share structural characteristics including an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple, different domain types. Notch family members play a role in a variety of developmental processes by controlling cell fate decisions. The Notch signaling network is an evolutionarily conserved intercellular signaling pathway which regulates interactions between physically adjacent cells. In Drosophila, notch interaction with its cell-bound ligands (delta, serrate) establishes an intercellular signaling pathway that plays a key role in development. Homologues of the notch-ligands have also been identified in human, but precise interactions between these ligands and the human notch homologues remain to be determined. This protein is cleaved in the trans-Golgi network, and presented on the cell surface as a heterodimer. This protein functions as a receptor for membrane bound ligands, and may play a role in vascular, renal and hepatic development. Two transcript variants encoding different isoforms have been found for this gene. | 4853 | NOTCH2 | notch 2 | ENSG00000134250 | NA |
| NA | 26959 | HBP1 | HMG-box transcription factor 1 | ENSG00000105856 | NA |
| NA | 3488 | IGFBP5 | insulin like growth factor binding protein 5 | ENSG00000115461 | NA |
| This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | 716 | C1S | complement component 1, s subcomponent | ENSG00000182326 | NA |
| NA | 7763 | ZFAND5 | zinc finger AN1-type containing 5 | ENSG00000107372 | NA |
| Radixin is a cytoskeletal protein that may be important in linking actin to the plasma membrane. It is highly similar in sequence to both ezrin and moesin. The radixin gene has been localized by fluorescence in situ hybridization to 11q23. A truncated version representing a pseudogene (RDXP2) was assigned to Xp21.3. Another pseudogene that seemed to lack introns (RDXP1) was mapped to 11p by Southern and PCR analyses. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5962 | RDX | radixin | ENSG00000137710 | NA |
| Synaptic vesicles are responsible for regulating the storage and release of neurotransmitters in the nerve terminal. The protein encoded by this gene is an abundant integral membrane protein of cholinergic synaptic vesicles and is thought to be involved in vesicular transport. It belongs to the quinone oxidoreductase subfamily of zinc-containing alcohol dehydrogenase proteins. | 10493 | VAT1 | vesicle amine transport 1 | ENSG00000108828 | NA |
| The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | 125 | ADH1B | alcohol dehydrogenase 1B (class I), beta polypeptide | ENSG00000196616 | NA |
| NA | 51313 | FAM198B | family with sequence similarity 198 member B | ENSG00000164125 | NA |
| NA | 9467 | SH3BP5 | SH3 domain binding protein 5 | ENSG00000131370 | NA |
| This gene encodes a member of the peptidyl-prolyl cis-trans isomerase (PPIase) family. PPIases catalyze the cis-trans isomerization of proline imidic peptide bonds in oligopeptides and accelerate the folding of proteins. The encoded protein is a cyclosporin binding-protein and may play a role in cyclosporin A-mediated immunosuppression. The protein can also interact with several HIV proteins, including p55 gag, Vpr, and capsid protein, and has been shown to be necessary for the formation of infectious HIV virions. Multiple pseudogenes that map to different chromosomes have been reported. | 5478 | PPIA | peptidylprolyl isomerase A | ENSG00000196262 | NA |
| The protein encoded by this gene is a transcriptional regulator that binds as a homodimer to activating transcription factor (ATF) sites in many cellular and viral promoters. The encoded protein represses PER1 and PER2 expression and therefore plays a role in the regulation of circadian rhythm. Three transcript variants encoding the same protein have been found for this gene. | 4783 | NFIL3 | nuclear factor, interleukin 3 regulated | ENSG00000165030 | NA |
| This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol to the sarcoplasmic reticulum lumen, and is involved in calcium sequestration associated with muscular excitation and contraction. Alternative splicing results in multiple transcript variants encoding different isoforms. | 489 | ATP2A3 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 3 | ENSG00000074370 | NA |
| This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | 1465 | CSRP1 | cysteine and glycine rich protein 1 | ENSG00000159176 | NA |
| This gene encodes a lysine-specific histone demethylase that belongs to the jumonji/ARID domain-containing family of histone demethylases. The encoded protein is capable of demethylating tri-, di- and monomethylated lysine 4 of histone H3. This protein plays a role in the transcriptional repression of certain tumor suppressor genes and is upregulated in certain cancer cells. This protein may also play a role in genome stability and DNA repair. Alternate splicing results in multiple transcript variants. | 10765 | KDM5B | lysine demethylase 5B | ENSG00000117139 | NA |
| NA | 57561 | ARRDC3 | arrestin domain containing 3 | ENSG00000113369 | NA |
| Cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. This component is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may function in the regulation and assembly of the complex. This nuclear gene encodes subunit VIIc, which shares 87% and 85% amino acid sequence identity with mouse and bovine COX VIIc, respectively, and is found in all tissues. A pseudogene COX7CP1 has been found on chromosome 13. | 1350 | COX7C | cytochrome c oxidase subunit 7C | ENSG00000127184 | NA |
| This gene encodes an integral membrane protein containing four transmembrane regions and a C-terminal cytoplasmic tail that is tyrosine phosphorylated. The exact function of this protein is unclear, but studies of a similar rat protein suggest that it may play a role in regulating membrane traffic in non-neuronal cells. The gene belongs to the synaptogyrin gene family. Alternative splicing results in multiple transcript variants. | 9144 | SYNGR2 | synaptogyrin 2 | ENSG00000108639 | NA |
| NA | 6515 | SLC2A3 | solute carrier family 2 member 3 | ENSG00000059804 | NA |
| NA | 84898 | PLXDC2 | plexin domain containing 2 | ENSG00000120594 | NA |
| This gene encodes a transcription factor that binds to the sterol regulatory element-1 (SRE1), which is a decamer flanking the low density lipoprotein receptor gene and some genes involved in sterol biosynthesis. The protein is synthesized as a precursor that is attached to the nuclear membrane and endoplasmic reticulum. Following cleavage, the mature protein translocates to the nucleus and activates transcription by binding to the SRE1. Sterols inhibit the cleavage of the precursor, and the mature nuclear form is rapidly catabolized, thereby reducing transcription. The protein is a member of the basic helix-loop-helix-leucine zipper (bHLH-Zip) transcription factor family. This gene is located within the Smith-Magenis syndrome region on chromosome 17. | 6720 | SREBF1 | sterol regulatory element binding transcription factor 1 | ENSG00000072310 | NA |
| This gene encodes a member of the Kruppel-like family of transcription factors. The zinc finger protein is a transcriptional activator, and functions as a tumor suppressor. Multiple transcript variants encoding different isoforms have been found for this gene, some of which are implicated in carcinogenesis. | 1316 | KLF6 | Kruppel like factor 6 | ENSG00000067082 | NA |
| NA | 58191 | CXCL16 | C-X-C motif chemokine ligand 16 | ENSG00000161921 | NA |
| Cytochrome c oxidase (COX) is the terminal enzyme of the mitochondrial respiratory chain. It is a multi-subunit enzyme complex that couples the transfer of electrons from cytochrome c to molecular oxygen and contributes to a proton electrochemical gradient across the inner mitochondrial membrane. The complex consists of 13 mitochondrial- and nuclear-encoded subunits. The mitochondrially-encoded subunits perform the electron transfer and proton pumping activities. The functions of the nuclear-encoded subunits are unknown but they may play a role in the regulation and assembly of the complex. This gene encodes the nuclear-encoded subunit IV isoform 1 of the human mitochondrial respiratory chain enzyme. It is located at the 3’ of the NOC4 (neighbor of COX4) gene in a head-to-head orientation, and shares a promoter with it. Pseudogenes related to this gene are located on chromosomes 13 and 14. Alternative splicing results in multiple transcript variants encoding different isoforms. | 1327 | COX4I1 | cytochrome c oxidase subunit 4I1 | ENSG00000131143 | NA |
| Amino acid transporters play essential roles in the uptake of nutrients, production of energy, chemical metabolism, detoxification, and neurotransmitter cycling. SLC38A1 is an important transporter of glutamine, an intermediate in the detoxification of ammonia and the production of urea. Glutamine serves as a precursor for the synaptic transmitter, glutamate (Gu et al., 2001 [PubMed 11325958]). | 81539 | SLC38A1 | solute carrier family 38 member 1 | ENSG00000111371 | NA |
| This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | 6319 | SCD | stearoyl-CoA desaturase | ENSG00000099194 | NA |
| This gene encodes a membrane-bound protein that is a member of the mucin family. Mucins are O-glycosylated proteins that play an essential role in forming protective mucous barriers on epithelial surfaces. These proteins also play a role in intracellular signaling. This protein is expressed on the apical surface of epithelial cells that line the mucosal surfaces of many different tissues including lung, breast, stomach and pancreas. This protein is proteolytically cleaved into alpha and beta subunits that form a heterodimeric complex. The N-terminal alpha subunit functions in cell-adhesion and the C-terminal beta subunit is involved in cell signaling. Overexpression, aberrant intracellular localization, and changes in glycosylation of this protein have been associated with carcinomas. This gene is known to contain a highly polymorphic variable number tandem repeats (VNTR) domain. Alternate splicing results in multiple transcript variants. | 4582 | MUC1 | mucin 1, cell surface associated | ENSG00000185499 | NA |
| This gene encodes a member of the DnaJ or Hsp40 (heat shock protein 40 kD) family of proteins. DNAJ family members are characterized by a highly conserved amino acid stretch called the ‘J-domain’ and function as one of the two major classes of molecular chaperones involved in a wide range of cellular events, such as protein folding and oligomeric protein complex assembly. The encoded protein is a molecular chaperone that stimulates the ATPase activity of Hsp70 heat-shock proteins in order to promote protein folding and prevent misfolded protein aggregation. Alternative splicing results in multiple transcript variants. | 3337 | DNAJB1 | DnaJ heat shock protein family (Hsp40) member B1 | ENSG00000132002 | NA |
| This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 7169 | TPM2 | tropomyosin 2 (beta) | ENSG00000198467 | NA |
| This gene belongs to the chemokine-like factor gene superfamily, a novel family that links the chemokine and the transmembrane 4 superfamilies of signaling molecules. The protein encoded by this gene may play an important role in testicular development. | 146225 | CMTM2 | CKLF like MARVEL transmembrane domain containing 2 | ENSG00000140932 | NA |
| This gene encodes a member of the myotubularin dual specificity protein phosphatase gene family. The encoded protein is structurally similar to myotubularin but in addition contains a FYVE domain and an N-terminal PH-GRAM domain. The protein can self-associate and also form heteromers with another myotubularin related protein. The protein binds to phosphoinositide lipids through the PH-GRAM domain, and can hydrolyze phosphatidylinositol(3)-phosphate and phosphatidylinositol(3,5)-biphosphate in vitro. The encoded protein has been observed to have a perinuclear, possibly membrane-bound, distribution in cells, but it has also been found free in the cytoplasm. Multiple transcript variants encoding different isoforms have been found for this gene. | 8897 | MTMR3 | myotubularin related protein 3 | ENSG00000100330 | NA |
| This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin. In these bifunctional molecules, the protein moiety binds collagen fibrils and the highly charged hydrophilic glycosaminoglycans regulate interfibrillar spacings. Lumican is the major keratan sulfate proteoglycan of the cornea but is also distributed in interstitial collagenous matrices throughout the body. Lumican may regulate collagen fibril organization and circumferential growth, corneal transparency, and epithelial cell migration and tissue repair. | 4060 | LUM | lumican | ENSG00000139329 | NA |
| NA | 92840 | REEP6 | receptor accessory protein 6 | ENSG00000115255 | NA |
| Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L15P family of ribosomal proteins. It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, multiple processed pseudogenes derived from this gene are dispersed through the genome. | 6157 | RPL27A | ribosomal protein L27a | ENSG00000166441 | NA |
| The product of this gene is a membrane-associated protein that functions in clathrin-mediated endocytosis and protein trafficking within the cell. The encoded protein binds to the huntingtin protein in the brain; this interaction is lost in Huntington’s disease. Alternative splicing results in multiple transcript variants. | 3092 | HIP1 | huntingtin interacting protein 1 | ENSG00000127946 | NA |
| This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | 2335 | FN1 | fibronectin 1 | ENSG00000115414 | NA |
| This gene encodes a member of the sestrin family of stress-induced proteins. The encoded protein reduces the levels of intracellular reactive oxygen species induced by activated Ras downstream of RAC-alpha serine/threonine-protein kinase (Akt) and FoxO transcription factor. The protein is required for normal regulation of blood glucose, insulin resistance and plays a role in lipid storage in obesity. Alternative splicing results in multiple transcript variants. | 143686 | SESN3 | sestrin 3 | ENSG00000149212 | NA |
| NA | 7091 | TLE4 | transducin like enhancer of split 4 | ENSG00000106829 | NA |
| The protein encoded by this gene is a member of the Zfh1 family of 2-handed zinc finger/homeodomain proteins. It is located in the nucleus and functions as a DNA-binding transcriptional repressor that interacts with activated SMADs. Mutations in this gene are associated with Hirschsprung disease/Mowat-Wilson syndrome. Alternatively spliced transcript variants have been found for this gene. | 9839 | ZEB2 | zinc finger E-box binding homeobox 2 | ENSG00000169554 | NA |
| NA | ENSG00000211675 | IGLC1 | immunoglobulin lambda constant 1 (Mcg marker) | ENSG00000211675 | NA |
| NA | NA | NA | NA | ENSG00000090920 | TRUE |
| This gene encodes one of the immunoglobulin lambda-like polypeptides. It is located within the immunoglobulin lambda locus but it does not require somatic rearrangement for expression. The first exon of this gene is unrelated to immunoglobulin variable genes; the second and third exons are the immunoglobulin lambda joining 1 and the immunoglobulin lambda constant 1 gene segments. Alternative splicing results in multiple transcript variants. | 100423062 | IGLL5 | immunoglobulin lambda like polypeptide 5 | ENSG00000254709 | NA |
| The protein encoded by this gene catalyzes the transport of phosphate into the mitochondrial matrix, either by proton cotransport or in exchange for hydroxyl ions. The protein contains three related segments arranged in tandem which are related to those found in other characterized members of the mitochondrial carrier family. Both the N-terminal and C-terminal regions of this protein protrude toward the cytosol. Multiple alternatively spliced transcript variants have been isolated. | 5250 | SLC25A3 | solute carrier family 25 member 3 | ENSG00000075415 | NA |
| This gene encodes the mitochondrial enzyme which catalyzes the rate-limiting step in heme (iron-protoporphyrin) biosynthesis. The enzyme encoded by this gene is the housekeeping enzyme; a separate gene encodes a form of the enzyme that is specific for erythroid tissue. The level of the mature encoded protein is regulated by heme: high levels of heme down-regulate the mature enzyme in mitochondria while low heme levels up-regulate. A pseudogene of this gene is located on chromosome 12. Alternative splicing results in multiple transcript variants encoding different isoforms. | 211 | ALAS1 | 5’-aminolevulinate synthase 1 | ENSG00000023330 | NA |
| This gene is a member of the Regulator of Complement Activation (RCA) gene cluster and encodes a protein with twenty short consensus repeat (SCR) domains. This protein is secreted into the bloodstream and has an essential role in the regulation of complement activation, restricting this innate defense mechanism to microbial infections. Mutations in this gene have been associated with hemolytic-uremic syndrome (HUS) and chronic hypocomplementemic nephropathy. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 3075 | CFH | complement factor H | ENSG00000000971 | NA |
| The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | 5265 | SERPINA1 | serpin family A member 1 | ENSG00000197249 | NA |
| This gene is a member of the NADH dehydrogenase (ubiquinone) iron-sulfur protein family. The encoded protein is a subunit of the NADH:ubiquinone oxidoreductase (complex I), the first enzyme complex in the electron transport chain located in the inner mitochondrial membrane. Alternative splicing results in multiple transcript variants and pseudogenes have been identified on chromosomes 1, 4 and 17. | 4725 | NDUFS5 | NADH:ubiquinone oxidoreductase subunit S5 | ENSG00000168653 | NA |
| This gene is located in an imprinted region of chromosome 11 near the insulin-like growth factor 2 (IGF2) gene. This gene is only expressed from the maternally-inherited chromosome, whereas IGF2 is only expressed from the paternally-inherited chromosome. The product of this gene is a long non-coding RNA which functions as a tumor suppressor. Mutations in this gene have been associated with Beckwith-Wiedemann Syndrome and Wilms tumorigenesis. Alternative splicing results in multiple transcript variants. | 283120 | H19 | H19, imprinted maternally expressed transcript (non-protein coding) | ENSG00000130600 | NA |
| This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL’s transactivation of PER1. This gene is believed to be involved in the control of circadian rhythm and cell differentiation. | 8553 | BHLHE40 | basic helix-loop-helix family member e40 | ENSG00000134107 | NA |
| NA | 8420 | SNHG3 | small nucleolar RNA host gene 3 | ENSG00000242125 | NA |
| This gene belongs to the family of reticulon encoding genes. Reticulons are associated with the endoplasmic reticulum, and are involved in neuroendocrine secretion or in membrane trafficking in neuroendocrine cells. The product of this gene is a potent neurite outgrowth inhibitor which may also help block the regeneration of the central nervous system in higher vertebrates. Alternatively spliced transcript variants derived both from differential splicing and differential promoter usage and encoding different isoforms have been identified. | 57142 | RTN4 | reticulon 4 | ENSG00000115310 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",5,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
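The table above contains one row whose `notfound` flag is `TRUE` (ENSG00000090920), and the `write.table()` call saves every queried ID whether or not it was matched. If only annotated genes are wanted downstream, a minimal sketch along the following lines could filter them first. `keep_matched` is a hypothetical helper, not part of mygene, and it assumes the query result has already been converted with `as.data.frame()`.

```r
# Hypothetical helper (not part of mygene): drop rows flagged as not found.
# Assumes `annot` is a plain data.frame with an optional logical `notfound` column.
keep_matched <- function(annot) {
  if (!"notfound" %in% names(annot)) {
    return(annot)  # no flag column, so every query matched
  }
  unmatched <- !is.na(annot$notfound) & annot$notfound
  annot[!unmatched, , drop = FALSE]
}

# Toy input built from two rows of the table above.
toy <- data.frame(
  query    = c("ENSG00000115310", "ENSG00000090920"),
  symbol   = c("RTN4", NA),
  notfound = c(NA, TRUE),
  stringsAsFactors = FALSE
)
matched <- keep_matched(toy)
```

Under these assumptions, `matched$query` could be passed to `write.table()` in place of `out$query` when unmatched IDs should be left out of the saved file.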
out <- mygene::queryMany(gene_list[6,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | symbol | summary | X_id | name | notfound |
|---|---|---|---|---|---|
| ENSG00000244734 | HBB | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin cause beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | 3043 | hemoglobin subunit beta | NA |
| ENSG00000168542 | COL3A1 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome types IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1281 | collagen type III alpha 1 chain | NA |
| ENSG00000167768 | KRT1 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3848 | keratin 1 | NA |
| ENSG00000162896 | PIGR | This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | 5284 | polymeric immunoglobulin receptor | NA |
| ENSG00000159251 | ACTC1 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | 70 | actin, alpha, cardiac muscle 1 | NA |
| ENSG00000170323 | FABP4 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs roles include fatty acid uptake, transport, and metabolism. | 2167 | fatty acid binding protein 4 | NA |
| ENSG00000059804 | SLC2A3 | NA | 6515 | solute carrier family 2 member 3 | NA |
| ENSG00000139112 | GABARAPL1 | NA | 23710 | GABA type A receptor associated protein like 1 | NA |
| ENSG00000079308 | TNS1 | The protein encoded by this gene localizes to focal adhesions, regions of the plasma membrane where the cell attaches to the extracellular matrix. This protein crosslinks actin filaments and contains a Src homology 2 (SH2) domain, which is often found in molecules involved in signal transduction. This protein is a substrate of calpain II. Alternative splicing results in multiple transcript variants encoding different isoforms. | 7145 | tensin 1 | NA |
| ENSG00000110799 | VWF | This gene encodes a glycoprotein involved in hemostasis. The encoded preproprotein is proteolytically processed following assembly into large multimeric complexes. These complexes function in the adhesion of platelets to sites of vascular injury and the transport of various proteins in the blood. Mutations in this gene result in von Willebrand disease, an inherited bleeding disorder. An unprocessed pseudogene has been found on chromosome 22. | 7450 | von Willebrand factor | NA |
| ENSG00000211890 | IGHA2 | NA | ENSG00000211890 | immunoglobulin heavy constant alpha 2 (A2m marker) | NA |
| ENSG00000096384 | HSP90AB1 | This gene encodes a member of the heat shock protein 90 family; these proteins are involved in signal transduction, protein folding and degradation and morphological evolution. This gene encodes the constitutive form of the cytosolic 90 kDa heat-shock protein and is thought to play a role in gastric apoptosis and inflammation. Alternative splicing results in multiple transcript variants. Pseudogenes have been identified on multiple chromosomes. | 3326 | heat shock protein 90kDa alpha family class B member 1 | NA |
| ENSG00000175084 | DES | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | 1674 | desmin | NA |
| ENSG00000075624 | ACTB | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | 60 | actin, beta | NA |
| ENSG00000183087 | GAS6 | This gene encodes a gamma-carboxyglutamic acid (Gla)-containing protein thought to be involved in the stimulation of cell proliferation. This gene is frequently overexpressed in many cancers and has been implicated as an adverse prognostic marker. Elevated protein levels are additionally associated with a variety of disease states, including venous thromboembolic disease, systemic lupus erythematosus, chronic renal failure, and preeclampsia. | 2621 | growth arrest specific 6 | NA |
| ENSG00000134107 | BHLHE40 | This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL’s transactivation of PER1. This gene is believed to be involved in the control of circadian rhythm and cell differentiation. | 8553 | basic helix-loop-helix family member e40 | NA |
| ENSG00000204983 | PRSS1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | 5644 | protease, serine 1 | NA |
| ENSG00000169710 | FASN | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 | fatty acid synthase | NA |
| ENSG00000163431 | LMOD1 | The leiomodin 1 protein has a putative membrane-spanning region and 2 types of tandemly repeated blocks. The transcript is expressed in all tissues tested, with the highest levels in thyroid, eye muscle, skeletal muscle, and ovary. Increased expression of leiomodin 1 may be linked to Graves’ disease and thyroid-associated ophthalmopathy. | 25802 | leiomodin 1 | NA |
| ENSG00000104879 | CKM | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | 1158 | creatine kinase, M-type | NA |
| ENSG00000101187 | SLCO4A1 | NA | 28231 | solute carrier organic anion transporter family member 4A1 | NA |
| ENSG00000144381 | HSPD1 | This gene encodes a member of the chaperonin family. The encoded mitochondrial protein may function as a signaling molecule in the innate immune system. This protein is essential for the folding and assembly of newly imported proteins in the mitochondria. This gene is adjacent to a related family member and the region between the 2 genes functions as a bidirectional promoter. Several pseudogenes have been associated with this gene. Two transcript variants encoding the same protein have been identified for this gene. Mutations associated with this gene cause autosomal recessive spastic paraplegia 13. | 3329 | heat shock protein family D (Hsp60) member 1 | NA |
| ENSG00000196091 | MYBPC1 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 4604 | myosin binding protein C, slow type | NA |
| ENSG00000148677 | ANKRD1 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 | ankyrin repeat domain 1 | NA |
| ENSG00000149591 | TAGLN | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | 6876 | transgelin | NA |
| ENSG00000166825 | ANPEP | Aminopeptidase N is located in the small-intestinal and renal microvillar membrane, and also in other plasma membranes. In the small intestine aminopeptidase N plays a role in the final digestion of peptides generated from hydrolysis of proteins by gastric and pancreatic proteases. Its function in proximal tubular epithelial cells and other cell types is less clear. The large extracellular carboxyterminal domain contains a pentapeptide consensus sequence characteristic of members of the zinc-binding metalloproteinase superfamily. Sequence comparisons with known enzymes of this class showed that CD13 and aminopeptidase N are identical. The latter enzyme was thought to be involved in the metabolism of regulatory peptides by diverse cell types, including small intestinal and renal tubular epithelial cells, macrophages, granulocytes, and synaptic membranes from the CNS. Human aminopeptidase N is a receptor for one strain of human coronavirus that is an important cause of upper respiratory tract infections. Defects in this gene appear to be a cause of various types of leukemia or lymphoma. | 290 | alanyl aminopeptidase, membrane | NA |
| ENSG00000091704 | CPA1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | 1357 | carboxypeptidase A1 | NA |
| ENSG00000119508 | NR4A3 | This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. The encoded protein may act as a transcriptional activator. The protein can efficiently bind the NGFI-B Response Element (NBRE). Three different versions of extraskeletal myxoid chondrosarcomas (EMCs) are the result of reciprocal translocations between this gene and other genes. The translocation breakpoints are associated with Nuclear Receptor Subfamily 4, Group A, Member 3 (on chromosome 9) and either Ewing Sarcoma Breakpoint Region 1 (on chromosome 22), RNA Polymerase II, TATA Box-Binding Protein-Associated Factor, 68-KD (on chromosome 17), or Transcription factor 12 (on chromosome 15). Multiple transcript variants encoding different isoforms have been found for this gene. | 8013 | nuclear receptor subfamily 4 group A member 3 | NA |
| ENSG00000143416 | SELENBP1 | This gene encodes a member of the selenium-binding protein family. Selenium is an essential nutrient that exhibits potent anticarcinogenic properties, and deficiency of selenium may cause certain neurologic diseases. The effects of selenium in preventing cancer and neurologic diseases may be mediated by selenium-binding proteins, and decreased expression of this gene may be associated with several types of cancer. The encoded protein may play a selenium-dependent role in ubiquitination/deubiquitination-mediated protein degradation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 8991 | selenium binding protein 1 | NA |
| ENSG00000090920 | NA | NA | NA | NA | TRUE |
| ENSG00000115386 | REG1A | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | 5967 | regenerating family member 1 alpha | NA |
| ENSG00000204628 | RACK1 | NA | 10399 | receptor for activated C kinase 1 | NA |
| ENSG00000170027 | YWHAG | This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 100% identical to the rat ortholog. It is induced by growth factors in human vascular smooth muscle cells, and is also highly expressed in skeletal and heart muscles, suggesting an important role for this protein in muscle tissue. It has been shown to interact with RAF1 and protein kinase C, proteins involved in various signal transduction pathways. | 7532 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein gamma | NA |
| ENSG00000186395 | KRT10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | keratin 10 | NA |
| ENSG00000068976 | PYGM | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | 5837 | phosphorylase, glycogen, muscle | NA |
| ENSG00000140403 | DNAJA4 | NA | 55466 | DnaJ heat shock protein family (Hsp40) member A4 | NA |
| ENSG00000168209 | DDIT4 | NA | 54541 | DNA damage inducible transcript 4 | NA |
| ENSG00000175535 | PNLIP | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | 5406 | pancreatic lipase | NA |
| ENSG00000171401 | KRT13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | keratin 13 | NA |
| ENSG00000105220 | GPI | This gene encodes a member of the glucose phosphate isomerase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. In the cytoplasm, the gene product functions as a glycolytic enzyme (glucose-6-phosphate isomerase) that interconverts glucose-6-phosphate and fructose-6-phosphate. Extracellularly, the encoded protein (also referred to as neuroleukin) functions as a neurotrophic factor that promotes survival of skeletal motor neurons and sensory neurons, and as a lymphokine that induces immunoglobulin secretion. The encoded protein is also referred to as autocrine motility factor based on an additional function as a tumor-secreted cytokine and angiogenic factor. Defects in this gene are the cause of nonspherocytic hemolytic anemia and a severe enzyme deficiency can be associated with hydrops fetalis, immediate neonatal death and neurological impairment. Alternative splicing results in multiple transcript variants. | 2821 | glucose-6-phosphate isomerase | NA |
| ENSG00000170477 | KRT4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3851 | keratin 4 | NA |
| ENSG00000135046 | ANXA1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | 301 | annexin A1 | NA |
| ENSG00000163631 | ALB | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | 213 | albumin | NA |
| ENSG00000120049 | KCNIP2 | This gene encodes a member of the family of voltage-gated potassium (Kv) channel-interacting proteins (KCNIPs), which belongs to the recoverin branch of the EF-hand superfamily. Members of the KCNIP family are small calcium binding proteins. They all have EF-hand-like domains, and differ from each other in the N-terminus. They are integral subunit components of native Kv4 channel complexes. They may regulate A-type currents, and hence neuronal excitability, in response to changes in intracellular calcium. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified from this gene. | 30819 | potassium voltage-gated channel interacting protein 2 | NA |
| ENSG00000142789 | CELA3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | 10136 | chymotrypsin like elastase family member 3A | NA |
| ENSG00000159388 | BTG2 | The protein encoded by this gene is a member of the BTG/Tob family. This family has structurally related proteins that appear to have antiproliferative properties. This encoded protein is involved in the regulation of the G1/S transition of the cell cycle. | 7832 | BTG family member 2 | NA |
| ENSG00000107796 | ACTA2 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 59 | actin, alpha 2, smooth muscle, aorta | NA |
| ENSG00000118503 | TNFAIP3 | This gene was identified as a gene whose expression is rapidly induced by the tumor necrosis factor (TNF). The protein encoded by this gene is a zinc finger protein and ubiquitin-editing enzyme, and has been shown to inhibit NF-kappa B activation as well as TNF-mediated apoptosis. The encoded protein, which has both ubiquitin ligase and deubiquitinase activities, is involved in the cytokine-mediated immune and inflammatory responses. Several transcript variants encoding the same protein have been found for this gene. | 7128 | TNF alpha induced protein 3 | NA |
| ENSG00000147872 | PLIN2 | The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | 123 | perilipin 2 | NA |
| ENSG00000177469 | PTRF | This gene encodes a protein that enables the dissociation of paused ternary polymerase I transcription complexes from the 3’ end of pre-rRNA transcripts. This protein regulates rRNA transcription by promoting the dissociation of transcription complexes and the reinitiation of polymerase I on nascent rRNA transcripts. This protein also localizes to caveolae at the plasma membrane and is thought to play a critical role in the formation of caveolae and the stabilization of caveolins. This protein translocates from caveolae to the cytoplasm after insulin stimulation. Caveolae contain truncated forms of this protein and may be the site of phosphorylation-dependent proteolysis. This protein is also thought to modify lipid metabolism and insulin-regulated gene expression. Mutations in this gene result in a disorder characterized by generalized lipodystrophy and muscular dystrophy. | 284119 | polymerase I and transcript release factor | NA |
| ENSG00000163017 | ACTG2 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | 72 | actin, gamma 2, smooth muscle, enteric | NA |
| ENSG00000159176 | CSRP1 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | 1465 | cysteine and glycine rich protein 1 | NA |
| ENSG00000177606 | JUN | This gene is the putative transforming gene of avian sarcoma virus 17. It encodes a protein which is highly similar to the viral protein, and which interacts directly with specific target DNA sequences to regulate gene expression. This gene is intronless and is mapped to 1p32-p31, a chromosomal region involved in both translocations and deletions in human malignancies. | 3725 | Jun proto-oncogene, AP-1 transcription factor subunit | NA |
| ENSG00000171747 | LGALS4 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | 3960 | galectin 4 | NA |
| ENSG00000169347 | GP2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | 2813 | glycoprotein 2 | NA |
| ENSG00000144655 | CSRNP1 | This gene encodes a protein that localizes to the nucleus and expression of this gene is induced in response to elevated levels of axin. The Wnt signalling pathway, which is negatively regulated by axin, is important in axis formation in early development and impaired regulation of this signalling pathway is often involved in tumors. A decreased level of expression of this gene in tumors compared to the level of expression in their corresponding normal tissues suggests that this gene product has a tumor suppressor function. Alternative splicing results in multiple transcript variants. | 64651 | cysteine and serine rich nuclear protein 1 | NA |
| ENSG00000132693 | CRP | The protein encoded by this gene belongs to the pentaxin family. It is involved in several host defense related functions based on its ability to recognize foreign pathogens and damaged cells of the host and to initiate their elimination by interacting with humoral and cellular effector systems in the blood. Consequently, the level of this protein in plasma increases greatly during acute phase response to tissue injury, infection, or other inflammatory stimuli. | 1401 | C-reactive protein, pentraxin-related | NA |
| ENSG00000112715 | VEGFA | This gene is a member of the PDGF/VEGF growth factor family. It encodes a heparin-binding protein, which exists as a disulfide-linked homodimer. This growth factor induces proliferation and migration of vascular endothelial cells, and is essential for both physiological and pathological angiogenesis. Disruption of this gene in mice resulted in abnormal embryonic blood vessel formation. This gene is upregulated in many known tumors and its expression is correlated with tumor stage and progression. Elevated levels of this protein are found in patients with POEMS syndrome, also known as Crow-Fukase syndrome. Allelic variants of this gene have been associated with microvascular complications of diabetes 1 (MVCD1) and atherosclerosis. Alternatively spliced transcript variants encoding different isoforms have been described. There is also evidence for alternative translation initiation from upstream non-AUG (CUG) codons resulting in additional isoforms. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is antiangiogenic. Expression of some isoforms derived from the AUG start codon is regulated by a small upstream open reading frame, which is located within an internal ribosome entry site. | 7422 | vascular endothelial growth factor A | NA |
| ENSG00000155657 | TTN | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | titin | NA |
| ENSG00000112936 | C7 | C7 is a component of the complement system. It participates in the formation of Membrane Attack Complex (MAC). People with C7 deficiency are prone to bacterial infection. | 730 | complement component 7 | NA |
| ENSG00000197616 | MYH6 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | 4624 | myosin, heavy chain 6, cardiac muscle, alpha | NA |
| ENSG00000150991 | UBC | This gene represents a ubiquitin gene, ubiquitin C. The encoded protein is a polyubiquitin precursor. Conjugation of ubiquitin monomers or polymers can lead to various effects within a cell, depending on the residues to which ubiquitin is conjugated. Ubiquitination has been associated with protein degradation, DNA repair, cell cycle regulation, kinase modification, endocytosis, and regulation of other cell signaling pathways. | 7316 | ubiquitin C | NA |
| ENSG00000122786 | CALD1 | This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | 800 | caldesmon 1 | NA |
| ENSG00000072110 | ACTN1 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | 87 | actinin alpha 1 | NA |
| ENSG00000175445 | LPL | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | 4023 | lipoprotein lipase | NA |
| ENSG00000153002 | CPB1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | 1360 | carboxypeptidase B1 | NA |
| ENSG00000158050 | DUSP2 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1 and ERK2, is predominantly expressed in hematopoietic tissues, and is localized in the nucleus. | 1844 | dual specificity phosphatase 2 | NA |
| ENSG00000196531 | NACA | This gene encodes a protein that associates with basic transcription factor 3 (BTF3) to form the nascent polypeptide-associated complex (NAC). This complex binds to nascent proteins that lack a signal peptide motif as they emerge from the ribosome, blocking interaction with the signal recognition particle (SRP) and preventing mistranslocation to the endoplasmic reticulum. This protein is an IgE autoantigen in atopic dermatitis patients. Alternative splicing results in multiple transcript variants, but the full length nature of some of these variants, including those encoding very large proteins, has not been determined. There are multiple pseudogenes of this gene on different chromosomes. | 4666 | nascent polypeptide-associated complex alpha subunit | NA |
| ENSG00000120738 | EGR1 | The protein encoded by this gene belongs to the EGR family of C2H2-type zinc-finger proteins. It is a nuclear protein and functions as a transcriptional regulator. The products of target genes it activates are required for differentiation and mitogenesis. Studies suggest this is a cancer suppressor gene. | 1958 | early growth response 1 | NA |
| ENSG00000134339 | SAA2 | NA | 6289 | serum amyloid A2 | NA |
| ENSG00000138356 | AOX1 | Aldehyde oxidase produces hydrogen peroxide and, under certain conditions, can catalyze the formation of superoxide. Aldehyde oxidase is a candidate gene for amyotrophic lateral sclerosis. | 316 | aldehyde oxidase 1 | NA |
| ENSG00000138207 | RBP4 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | 5950 | retinol binding protein 4 | NA |
| ENSG00000112149 | CD83 | The protein encoded by this gene is a single-pass type I membrane protein and member of the immunoglobulin superfamily of receptors. The encoded protein may be involved in the regulation of antigen presentation. A soluble form of this protein can bind to dendritic cells and inhibit their maturation. Three transcript variants encoding different isoforms have been found for this gene. | 9308 | CD83 molecule | NA |
| ENSG00000143549 | TPM3 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. | 7170 | tropomyosin 3 | NA |
| ENSG00000134571 | MYBPC3 | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | 4607 | myosin binding protein C, cardiac | NA |
| ENSG00000170989 | S1PR1 | The protein encoded by this gene is structurally similar to G protein-coupled receptors and is highly expressed in endothelial cells. It binds the ligand sphingosine-1-phosphate with high affinity and high specificity, and is suggested to be involved in the processes that regulate the differentiation of endothelial cells. Activation of this receptor induces cell-cell adhesion. Alternative splicing results in multiple transcript variants. | 1901 | sphingosine-1-phosphate receptor 1 | NA |
| ENSG00000014641 | MDH1 | This gene encodes an enzyme that catalyzes the NAD/NADH-dependent, reversible oxidation of malate to oxaloacetate in many metabolic pathways, including the citric acid cycle. Two main isozymes are known to exist in eukaryotic cells: one is found in the mitochondrial matrix and the other in the cytoplasm. This gene encodes the cytosolic isozyme, which plays a key role in the malate-aspartate shuttle that allows malate to pass through the mitochondrial membrane to be transformed into oxaloacetate for further cellular processes. Alternatively spliced transcript variants have been found for this gene. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is localized in the peroxisomes. Pseudogenes have been identified on chromosomes X and 6. | 4190 | malate dehydrogenase 1 | NA |
| ENSG00000164056 | SPRY1 | NA | 10252 | sprouty RTK signaling antagonist 1 | NA |
| ENSG00000137392 | CLPS | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | 1208 | colipase | NA |
| ENSG00000109061 | MYH1 | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | 4619 | myosin, heavy chain 1, skeletal muscle, adult | NA |
| ENSG00000147459 | DOCK5 | NA | 80005 | dedicator of cytokinesis 5 | NA |
| ENSG00000269926 | RP11-442H21.2 | NA | ENSG00000269926 | NA | NA |
| ENSG00000151914 | DST | This gene encodes a member of the plakin protein family of adhesion junction plaque proteins. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene, but the full-length nature of some variants has not been defined. It has been reported that some isoforms are expressed in neural and muscle tissue, anchoring neural intermediate filaments to the actin cytoskeleton, and some isoforms are expressed in epithelial tissue, anchoring keratin-containing intermediate filaments to hemidesmosomes. Consistent with the expression, mice defective for this gene show skin blistering and neurodegeneration. | 667 | dystonin | NA |
| ENSG00000125730 | C3 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | 718 | complement component 3 | NA |
| ENSG00000219073 | CELA3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | 23436 | chymotrypsin like elastase family member 3B | NA |
| ENSG00000125414 | MYH2 | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 4620 | myosin, heavy chain 2, skeletal muscle, adult | NA |
| ENSG00000089157 | RPLP0 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein, which is the functional equivalent of the E. coli L10 ribosomal protein, belongs to the L10P family of ribosomal proteins. It is a neutral phosphoprotein with a C-terminal end that is nearly identical to the C-terminal ends of the acidic ribosomal phosphoproteins P1 and P2. The P0 protein can interact with P1 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Transcript variants derived from alternative splicing exist; they encode the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | 6175 | ribosomal protein lateral stalk subunit P0 | NA |
| ENSG00000118194 | TNNT2 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | 7139 | troponin T2, cardiac type | NA |
| ENSG00000170315 | UBB | This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | 7314 | ubiquitin B | NA |
| ENSG00000035862 | TIMP2 | This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | 7077 | TIMP metallopeptidase inhibitor 2 | NA |
| ENSG00000166923 | GREM1 | This gene encodes a member of the BMP (bone morphogenic protein) antagonist family. Like BMPs, BMP antagonists contain cystine knots and typically form homo- and heterodimers. The CAN (cerberus and dan) subfamily of BMP antagonists, to which this gene belongs, is characterized by a C-terminal cystine knot with an eight-membered ring. The antagonistic effect of the secreted glycosylated protein encoded by this gene is likely due to its direct binding to BMP proteins. As an antagonist of BMP, this gene may play a role in regulating organogenesis, body patterning, and tissue differentiation. In mouse, this protein has been shown to relay the sonic hedgehog (SHH) signal from the polarizing region to the apical ectodermal ridge during limb bud outgrowth. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 26585 | gremlin 1, DAN family BMP antagonist | NA |
| ENSG00000135447 | PPP1R1A | NA | 5502 | protein phosphatase 1 regulatory inhibitor subunit 1A | NA |
| ENSG00000259716 | NA | NA | NA | NA | TRUE |
| ENSG00000197971 | MBP | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | 4155 | myelin basic protein | NA |
| ENSG00000106211 | HSPB1 | The protein encoded by this gene is induced by environmental stress and developmental changes. The encoded protein is involved in stress resistance and actin organization and translocates from the cytoplasm to the nucleus upon stress induction. Defects in this gene are a cause of Charcot-Marie-Tooth disease type 2F (CMT2F) and distal hereditary motor neuropathy (dHMN). | 3315 | heat shock protein family B (small) member 1 | NA |
| ENSG00000135842 | FAM129A | NA | 116496 | family with sequence similarity 129 member A | NA |
| ENSG00000129353 | SLC44A2 | NA | 57153 | solute carrier family 44 member 2 | NA |
| ENSG00000111341 | MGP | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | 4256 | matrix Gla protein | NA |
| ENSG00000070756 | PABPC1 | This gene encodes a poly(A) binding protein. The protein shuttles between the nucleus and cytoplasm and binds to the 3’ poly(A) tail of eukaryotic messenger RNAs via RNA-recognition motifs. The binding of this protein to poly(A) promotes ribosome recruitment and translation initiation; it is also required for poly(A) shortening which is the first step in mRNA decay. The gene is part of a small gene family including three protein-coding genes and several pseudogenes. | 26986 | poly(A) binding protein cytoplasmic 1 | NA |
| ENSG00000115541 | HSPE1 | This gene encodes a major heat shock protein which functions as a chaperonin. Its structure consists of a heptameric ring which binds to another heat shock protein in order to form a symmetric, functional heterodimer which enhances protein folding in an ATP-dependent manner. This gene and its co-chaperonin, HSPD1, are arranged in a head-to-head orientation on chromosome 2. Naturally occurring read-through transcription occurs between this locus and the neighboring locus MOBKL3. | 3336 | heat shock protein family E (Hsp10) member 1 | NA |
write.table(as.factor(out$query),
            file = paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 6, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
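The export step above hard-codes the cluster index in the file name. A minimal sketch of how the same step could be wrapped in a small helper is given below. The function name `write_cluster_genes`, its arguments, and the use of a temporary directory are assumptions made for illustration and are not part of the original analysis; the file name pattern `gene_names_clus_<k>.txt` is taken from the call above.

```r
# Hypothetical helper (not part of the original analysis): write one
# Ensembl gene ID per line to gene_names_clus_<cluster>.txt in out_dir.
write_cluster_genes <- function(ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  # One ID per line, no quotes and no row or column names
  writeLines(as.character(ids), path)
  invisible(path)
}

# Toy input standing in for out$query; a temporary directory is used here
# instead of the report's ../utilities/... folder.
toy_ids <- c("ENSG00000171401", "ENSG00000170477")
toy_path <- write_cluster_genes(toy_ids, cluster = 6, out_dir = tempdir())
written <- readLines(toy_path)
print(written)
```

With a helper of this kind, the cluster index passed to the file name can be kept next to the row of `gene_list` that was queried, which makes a mismatch between the two easier to spot.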
out <- mygene::queryMany(gene_list[7, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
kable(as.data.frame(out))
| name | summary | X_id | query | symbol |
|---|---|---|---|---|
| keratin 13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | ENSG00000171401 | KRT13 |
| keratin 4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3851 | ENSG00000170477 | KRT4 |
| small proline rich protein 3 | NA | 6707 | ENSG00000163209 | SPRR3 |
| NA | NA | ENSG00000229732 | ENSG00000229732 | AC019349.5 |
| keratin 6A | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3853 | ENSG00000205420 | KRT6A |
| S100 calcium binding protein A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | 6280 | ENSG00000163220 | S100A9 |
| annexin A1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | 301 | ENSG00000135046 | ANXA1 |
| cornulin | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | 49860 | ENSG00000143536 | CRNN |
| Rh family C glycoprotein | NA | 51458 | ENSG00000140519 | RHCG |
| S100 calcium binding protein A8 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | 6279 | ENSG00000143546 | S100A8 |
| cystatin B | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins l, h and b. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). | 1476 | ENSG00000160213 | CSTB |
| epithelial membrane protein 1 | NA | 2012 | ENSG00000134531 | EMP1 |
| keratin 10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | ENSG00000186395 | KRT10 |
| keratin 1 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3848 | ENSG00000167768 | KRT1 |
| S100 calcium binding protein A14 | This gene encodes a member of the S100 protein family which contains an EF-hand motif and binds calcium. The gene is located in a cluster of S100 genes on chromosome 1. Levels of the encoded protein have been found to be lower in cancerous tissue and associated with metastasis suggesting a tumor suppressor function (PMID: 19956863, 19351828). | 57402 | ENSG00000189334 | S100A14 |
| mal, T-cell differentiation protein | The protein encoded by this gene is a highly hydrophobic integral membrane protein belonging to the MAL family of proteolipids. The protein has been localized to the endoplasmic reticulum of T-cells and is a candidate linker protein in T-cell signal transduction. In addition, this proteolipid is localized in compact myelin of cells in the nervous system and has been implicated in myelin biogenesis and/or function. The protein plays a role in the formation, stabilization and maintenance of glycosphingolipid-enriched membrane microdomains. Down-regulation of this gene has been associated with a variety of human epithelial malignancies. Alternative splicing produces four transcript variants which vary from each other by the presence or absence of alternatively spliced exons 2 and 3. | 4118 | ENSG00000172005 | MAL |
| transglutaminase 3 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene consists of two polypeptide chains activated from a single precursor protein by proteolysis. The encoded protein is involved in the later stages of cell envelope formation in the epidermis and hair follicle. | 7053 | ENSG00000125780 | TGM3 |
| small proline rich protein 2A | NA | 6700 | ENSG00000241794 | SPRR2A |
| cystatin A | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins, and kininogens. This gene encodes a stefin that functions as a cysteine protease inhibitor, forming tight complexes with papain and the cathepsins B, H, and L. The protein is one of the precursor proteins of cornified cell envelope in keratinocytes and plays a role in epidermal development and maintenance. Stefins have been proposed as prognostic and diagnostic tools for cancer. | 1475 | ENSG00000121552 | CSTA |
| small proline rich protein 1A | NA | 6698 | ENSG00000169474 | SPRR1A |
| keratin 2 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3849 | ENSG00000172867 | KRT2 |
| extracellular matrix protein 1 | This gene encodes a soluble protein that is involved in endochondral bone formation, angiogenesis, and tumor biology. It also interacts with a variety of extracellular and structural proteins, contributing to the maintenance of skin integrity and homeostasis. Mutations in this gene are associated with lipoid proteinosis disorder (also known as hyalinosis cutis et mucosae or Urbach-Wiethe disease) that is characterized by generalized thickening of skin, mucosae and certain viscera. Alternatively spliced transcript variants encoding distinct isoforms have been described for this gene. | 1893 | ENSG00000143369 | ECM1 |
| family with sequence similarity 129 member B | NA | 64855 | ENSG00000136830 | FAM129B |
| S100 calcium binding protein A11 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in motility, invasion, and tubulin polymerization. Chromosomal rearrangements and altered expression of this gene have been implicated in tumor metastasis. | 6282 | ENSG00000163191 | S100A11 |
| S100 calcium binding protein A16 | NA | 140576 | ENSG00000188643 | S100A16 |
| interleukin 1 receptor antagonist | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | 3557 | ENSG00000136689 | IL1RN |
| gap junction protein beta 2 | This gene encodes a member of the gap junction protein family. The gap junctions were first characterized by electron microscopy as regionally specialized structures on plasma membranes of contacting adherent cells. These structures were shown to consist of cell-to-cell channels that facilitate the transfer of ions and small molecules between cells. The gap junction proteins, also known as connexins, purified from fractions of enriched gap junctions from different tissues differ. According to sequence similarities at the nucleotide and amino acid levels, the gap junction proteins are divided into two categories, alpha and beta. Mutations in this gene are responsible for as much as 50% of pre-lingual, recessive deafness. | 2706 | ENSG00000165474 | GJB2 |
| serine peptidase inhibitor, Kazal type 5 | This gene encodes a multidomain serine protease inhibitor that contains 15 potential inhibitory domains. The encoded preproprotein is proteolytically processed to generate multiple protein products, which may exhibit unique activities and specificities. These proteins may play a role in skin and hair morphogenesis, as well as anti-inflammatory and antimicrobial protection of mucous epithelia. Mutations in this gene may result in Netherton syndrome, a disorder characterized by ichthyosis, defective cornification, and atopy. This gene is present in a gene cluster on chromosome 5. Alternative splicing results in multiple transcript variants. | 11005 | ENSG00000133710 | SPINK5 |
| periplakin | The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | 5493 | ENSG00000118898 | PPL |
| desmocollin 2 | This gene encodes a member of the desmocollin protein subfamily. Desmocollins, along with desmogleins, are cadherin-like transmembrane glycoproteins that are major components of the desmosome. Desmosomes are cell-cell junctions that help resist shearing forces and are found in high concentrations in cells subject to mechanical stress. This gene is found in a cluster with other desmocollin family members on chromosome 18. Mutations in this gene are associated with arrhythmogenic right ventricular dysplasia-11, and reduced protein expression has been described in several types of cancer. Alternative splicing results in multiple transcript variants. | 1824 | ENSG00000134755 | DSC2 |
| fatty acid binding protein 5 pseudogene 7 | NA | ENSG00000234964 | ENSG00000234964 | FABP5P7 |
| S100 calcium binding protein A2 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may have a tumor suppressor function. Chromosomal rearrangements and altered expression of this gene have been implicated in breast cancer. | 6273 | ENSG00000196754 | S100A2 |
| stratifin | NA | 2810 | ENSG00000175793 | SFN |
| peptidase inhibitor 3 | This gene encodes an elastase-specific inhibitor that functions as an antimicrobial peptide against Gram-positive and Gram-negative bacteria, and fungal pathogens. The protein contains a WAP-type four-disulfide core (WFDC) domain, and is thus a member of the WFDC domain family. Most WFDC gene members are localized to chromosome 20q12-q13 in two clusters: centromeric and telomeric. This gene belongs to the centromeric cluster. Expression of this gene is upregulated by bacterial lipopolysaccharides and cytokines. | 5266 | ENSG00000124102 | PI3 |
| keratin 19 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | 3880 | ENSG00000171345 | KRT19 |
| transglutaminase 1 | The protein encoded by this gene is a membrane protein that catalyzes the addition of an alkyl group from an alkylamine to a glutamine residue of a protein, forming an alkylglutamine in the protein. This protein alkylation leads to crosslinking of proteins and catenation of polyamines to proteins. This gene contains either one or two copies of a 22 nt repeat unit in its 3’ UTR. Mutations in this gene have been associated with autosomal recessive lamellar ichthyosis (LI) and nonbullous congenital ichthyosiform erythroderma (NCIE). | 7051 | ENSG00000092295 | TGM1 |
| phosphogluconate dehydrogenase | 6-phosphogluconate dehydrogenase is the second dehydrogenase in the pentose phosphate shunt. Deficiency of this enzyme is generally asymptomatic, and the inheritance of this disorder is autosomal dominant. Hemolysis results from combined deficiency of 6-phosphogluconate dehydrogenase and 6-phosphogluconolactonase suggesting a synergism of the two enzymopathies. Several transcript variants encoding different isoforms have been found for this gene. | 5226 | ENSG00000142657 | PGD |
| lipocalin 2 | This gene encodes a protein that belongs to the lipocalin family. Members of this family transport small hydrophobic molecules such as lipids, steroid hormones and retinoids. The protein encoded by this gene is a neutrophil gelatinase-associated lipocalin and plays a role in innate immunity by limiting bacterial growth as a result of sequestering iron-containing siderophores. The presence of this protein in blood and urine is an early biomarker of acute kidney injury. This protein is thought to be involved in multiple cellular processes, including maintenance of skin homeostasis, and suppression of invasiveness and metastasis. Mice lacking this gene are more susceptible to bacterial infection than wild type mice. | 3934 | ENSG00000148346 | LCN2 |
| keratin 15 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region on chromosome 17q21.2. | 3866 | ENSG00000171346 | KRT15 |
| secretory leukocyte peptidase inhibitor | This gene encodes a secreted inhibitor which protects epithelial tissues from serine proteases. It is found in various secretions including seminal plasma, cervical mucus, and bronchial secretions, and has affinity for trypsin, leukocyte elastase, and cathepsin G. Its inhibitory effect contributes to the immune response by protecting epithelial surfaces from attack by endogenous proteolytic enzymes. This antimicrobial protein has antibacterial, antifungal and antiviral activity. | 6590 | ENSG00000124107 | SLPI |
| ras homolog family member B | NA | 388 | ENSG00000143878 | RHOB |
| aquaporin 3 (Gill blood group) | This gene encodes the water channel protein aquaporin 3. Aquaporins are a family of small integral membrane proteins related to the major intrinsic protein, also known as aquaporin 0. Aquaporin 3 is localized at the basal lateral membranes of collecting duct cells in the kidney. In addition to its water channel function, aquaporin 3 has been found to facilitate the transport of nonionic small solutes such as urea and glycerol, but to a smaller degree. It has been suggested that water channels can be functionally heterogeneous and possess water and solute permeation mechanisms. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | 360 | ENSG00000165272 | AQP3 |
| serpin family B member 1 | The protein encoded by this gene is a member of the serpin family of proteinase inhibitors. Members of this family maintain homeostasis by neutralizing overexpressed proteinase activity through their function as suicide substrates. This protein inhibits the neutrophil-derived proteinases neutrophil elastase, cathepsin G, and proteinase-3 and thus protects tissues from damage at inflammatory sites. Alternative splicing results in multiple transcript variants. | 1992 | ENSG00000021355 | SERPINB1 |
| aldehyde dehydrogenase 3 family member A1 | Aldehyde dehydrogenases oxidize various aldehydes to the corresponding acids. They are involved in the detoxification of alcohol-derived acetaldehyde and in the metabolism of corticosteroids, biogenic amines, neurotransmitters, and lipid peroxidation. The enzyme encoded by this gene forms a cytoplasmic homodimer that preferentially oxidizes aromatic and medium-chain (6 carbons or more) saturated and unsaturated aldehyde substrates. It is thought to promote resistance to UV and 4-hydroxy-2-nonenal-induced oxidative damage in the cornea. The gene is located within the Smith-Magenis syndrome region on chromosome 17. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 218 | ENSG00000108602 | ALDH3A1 |
| tumor-associated calcium signal transducer 2 | This intronless gene encodes a carcinoma-associated antigen. This antigen is a cell surface receptor that transduces calcium signals. Mutations of this gene have been associated with gelatinous drop-like corneal dystrophy. | 4070 | ENSG00000184292 | TACSTD2 |
| small proline rich protein 1B | The protein encoded by this gene is an envelope protein of keratinocytes. The encoded protein is crosslinked to membrane proteins by transglutaminase, forming an insoluble layer under the plasma membrane. This protein is proline-rich and contains several tandem amino acid repeats. | 6699 | ENSG00000169469 | SPRR1B |
| granulin | Granulins are a family of secreted, glycosylated peptides that are cleaved from a single precursor protein with 7.5 repeats of a highly conserved 12-cysteine granulin/epithelin motif. The 88 kDa precursor protein, progranulin, is also called proepithelin and PC cell-derived growth factor. Cleavage of the signal peptide produces mature granulin which can be further cleaved into a variety of active, 6 kDa peptides. These smaller cleavage products are named granulin A, granulin B, granulin C, etc. Epithelins 1 and 2 are synonymous with granulins A and B, respectively. Both the peptides and intact granulin protein regulate cell growth. However, different members of the granulin protein family may act as inhibitors, stimulators, or have dual actions on cell growth. Granulin family members are important in normal development, wound healing, and tumorigenesis. | 2896 | ENSG00000030582 | GRN |
| loricrin | This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel’s syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases. | 4014 | ENSG00000203782 | LOR |
| cellular retinoic acid binding protein 2 | This gene encodes a member of the retinoic acid (RA, a form of vitamin A) binding protein family and lipocalin/cytosolic fatty-acid binding protein family. The protein is a cytosol-to-nuclear shuttling protein, which facilitates RA binding to its cognate receptor complex and transfer to the nucleus. It is involved in the retinoid signaling pathway, and is associated with increased circulating low-density lipoprotein cholesterol. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | 1382 | ENSG00000143320 | CRABP2 |
| annexin A2 | This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions as an autocrine factor which heightens osteoclast formation and bone resorption. This gene has three pseudogenes located on chromosomes 4, 9 and 10, respectively. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 302 | ENSG00000182718 | ANXA2 |
| S100 calcium binding protein A10 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in exocytosis and endocytosis. | 6281 | ENSG00000197747 | S100A10 |
| protease, serine 27 | This gene is located within a large protease gene cluster on chromosome 16. It belongs to the group-1 subfamily of serine proteases. The encoded protein is a secreted tryptic serine protease and is expressed mainly in the pancreas. Alternative splicing results in multiple transcript variants. | 83886 | ENSG00000172382 | PRSS27 |
| EPS8 like 1 | This gene encodes a protein that is related to epidermal growth factor receptor pathway substrate 8 (EPS8), a substrate for the epidermal growth factor receptor. The function of this protein is unknown. At least two alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 54869 | ENSG00000131037 | EPS8L1 |
| catenin delta 1 | This gene encodes a member of the Armadillo protein family, which function in adhesion between cells and signal transduction. Multiple translation initiation codons and alternative splicing result in many different isoforms being translated. Not all of the full-length natures of the described transcript variants have been determined. Read-through transcription also exists between this gene and the neighboring upstream thioredoxin-related transmembrane protein 2 (TMX2) gene. | 1500 | ENSG00000198561 | CTNND1 |
| fatty acid binding protein 5 | This gene encodes the fatty acid binding protein found in epidermal cells, and was first identified as being upregulated in psoriasis tissue. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. FABPs may play roles in fatty acid uptake, transport, and metabolism. Polymorphisms in this gene are associated with type 2 diabetes. The human genome contains many pseudogenes similar to this locus. | 2171 | ENSG00000164687 | FABP5 |
| calmodulin like 3 | NA | 810 | ENSG00000178363 | CALML3 |
| cornifelin | NA | 84518 | ENSG00000105427 | CNFN |
| myelin protein zero like 2 | Thymus development depends on a complex series of interactions between thymocytes and the stromal component of the organ. Epithelial V-like antigen (EVA) is expressed in thymus epithelium and strongly downregulated by thymocyte developmental progression. This gene is expressed in the thymus and in several epithelial structures early in embryogenesis. It is highly homologous to the myelin protein zero and, in thymus-derived epithelial cell lines, is poorly soluble in nonionic detergents, strongly suggesting an association to the cytoskeleton. Its capacity to mediate cell adhesion through a homophilic interaction and its selective regulation by T cell maturation might imply the participation of EVA in the earliest phases of thymus organogenesis. The protein bears a characteristic V-type domain and two potential N-glycosylation sites in the extracellular domain; a putative serine phosphorylation site for casein kinase 2 is also present in the cytoplasmic tail. Two transcript variants encoding the same protein have been found for this gene. | 10205 | ENSG00000149573 | MPZL2 |
| EPS8 like 2 | This gene encodes a member of the EPS8 gene family. The encoded protein, like other members of the family, is thought to link growth factor stimulation to actin organization, generating functional redundancy in the pathways that regulate actin cytoskeletal remodeling. | 64787 | ENSG00000177106 | EPS8L2 |
| RAB10, member RAS oncogene family | RAB10 belongs to the RAS (see HRAS; MIM 190020) superfamily of small GTPases. RAB proteins localize to exocytic and endocytic compartments and regulate intracellular vesicle trafficking (Bao et al., 1998 [PubMed 9918381]). | 10890 | ENSG00000084733 | RAB10 |
| fatty acyl-CoA reductase 1 | The protein encoded by this gene is required for the reduction of fatty acids to fatty alcohols, a process that is required for the synthesis of monoesters and ether lipids. NADPH is required as a cofactor in this reaction, and 16-18 carbon saturated and unsaturated fatty acids are the preferred substrate. This is a peroxisomal membrane protein, and studies suggest that the N-terminus contains a large catalytic domain located on the outside of the peroxisome, while the C-terminus is exposed to the matrix of the peroxisome. Studies indicate that the regulation of this protein is dependent on plasmalogen levels. Mutations in this gene have been associated with individuals affected by severe intellectual disability, early-onset epilepsy, microcephaly, congenital cataracts, growth retardation, and spasticity (PMID: 25439727). A pseudogene of this gene is located on chromosome 13. | 84188 | ENSG00000197601 | FAR1 |
| EPH receptor A2 | This gene belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family. EPH and EPH-related receptors have been implicated in mediating developmental events, particularly in the nervous system. Receptors in the EPH subfamily typically have a single kinase domain and an extracellular region containing a Cys-rich domain and 2 fibronectin type III repeats. The ephrin receptors are divided into 2 groups based on the similarity of their extracellular domain sequences and their affinities for binding ephrin-A and ephrin-B ligands. This gene encodes a protein that binds ephrin-A ligands. Mutations in this gene are the cause of certain genetically-related cataract disorders. | 1969 | ENSG00000142627 | EPHA2 |
| transmembrane protein 45A | NA | 55076 | ENSG00000181458 | TMEM45A |
| actin binding LIM protein 1 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | 3983 | ENSG00000099204 | ABLIM1 |
| NA | NA | ENSG00000249007 | ENSG00000249007 | RP11-510N19.5 |
| cysteine rich C-terminal 1 | NA | 54544 | ENSG00000169509 | CRCT1 |
| absent in melanoma 1 | NA | 202 | ENSG00000112297 | AIM1 |
| calmodulin like 5 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. | 51806 | ENSG00000178372 | CALML5 |
| aspartic peptidase, retroviral-like 1 | NA | 151516 | ENSG00000244617 | ASPRV1 |
| Rho GTPase activating protein 27 | This gene encodes a member of a large family of proteins that activate Rho-type guanosine triphosphate (GTP) metabolizing enzymes. The encoded protein may play a role in clathrin-mediated endocytosis. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 201176 | ENSG00000159314 | ARHGAP27 |
| quiescin sulfhydryl oxidase 1 | This gene encodes a protein that contains domains of thioredoxin and ERV1, members of two long-standing gene families. The gene expression is induced as fibroblasts begin to exit the proliferative cycle and enter quiescence, suggesting that this gene plays an important role in growth regulation. Two transcript variants encoding two different isoforms have been found for this gene. | 5768 | ENSG00000116260 | QSOX1 |
| endoplasmic reticulum oxidoreductase alpha | NA | 30001 | ENSG00000197930 | ERO1A |
| galectin 3 binding protein | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. LGALS3BP has been found elevated in the serum of patients with cancer and in those infected by the human immunodeficiency virus (HIV). It appears to be implicated in immune response associated with natural killer (NK) and lymphokine-activated killer (LAK) cell cytotoxicity. Using fluorescence in situ hybridization the full length 90K cDNA has been localized to chromosome 17q25. The native protein binds specifically to a human macrophage-associated lectin known as Mac-2 and also binds galectin 1. | 3959 | ENSG00000108679 | LGALS3BP |
| GIPC PDZ domain containing family member 1 | GIPC1 is a scaffolding protein that regulates cell surface receptor expression and trafficking (Lee et al., 2008 [PubMed 18775991]). | 10755 | ENSG00000123159 | GIPC1 |
| keratin 5 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3852 | ENSG00000186081 | KRT5 |
| prominin 2 | This gene encodes a member of the prominin family of pentaspan membrane glycoproteins. The encoded protein localizes to basal epithelial cells and may be involved in the organization of plasma membrane microdomains. Alternative splicing results in multiple transcript variants. | 150696 | ENSG00000155066 | PROM2 |
| carboxylesterase 2 | This gene encodes a member of the carboxylesterase large family. The family members are responsible for the hydrolysis or transesterification of various xenobiotics, such as cocaine and heroin, and endogenous substrates with ester, thioester, or amide bonds. They may participate in fatty acyl and cholesterol ester metabolism, and may play a role in the blood-brain barrier system. The protein encoded by this gene is the major intestinal enzyme and functions in intestine drug clearance. Alternatively spliced transcript variants have been found for this gene. | 8824 | ENSG00000172831 | CES2 |
| ATP binding cassette subfamily C member 5 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MRP subfamily which is involved in multi-drug resistance. This protein functions in the cellular export of its substrate, cyclic nucleotides. This export contributes to the degradation of phosphodiesterases and possibly an elimination pathway for cyclic nucleotides. Studies show that this protein provides resistance to thiopurine anticancer drugs, 6-mercaptopurine and thioguanine, and the anti-HIV drug 9-(2-phosphonylmethoxyethyl)adenine. This protein may be involved in resistance to thiopurines in acute lymphoblastic leukemia and antiretroviral nucleoside analogs in HIV-infected patients. Alternative splicing results in multiple transcript variants. | 10057 | ENSG00000114770 | ABCC5 |
| carcinoembryonic antigen related cell adhesion molecule 1 | This gene encodes a member of the carcinoembryonic antigen (CEA) gene family, which belongs to the immunoglobulin superfamily. Two subgroups of the CEA family, the CEA cell adhesion molecules and the pregnancy-specific glycoproteins, are located within a 1.2 Mb cluster on the long arm of chromosome 19. Eleven pseudogenes of the CEA cell adhesion molecule subgroup are also found in the cluster. The encoded protein was originally described in bile ducts of liver as biliary glycoprotein. Subsequently, it was found to be a cell-cell adhesion molecule detected on leukocytes, epithelia, and endothelia. The encoded protein mediates cell adhesion via homophilic as well as heterophilic binding to other proteins of the subgroup. Multiple cellular activities have been attributed to the encoded protein, including roles in the differentiation and arrangement of tissue three-dimensional structure, angiogenesis, apoptosis, tumor suppression, metastasis, and the modulation of innate and adaptive immune responses. Multiple transcript variants encoding different isoforms have been reported, but the full-length nature of all variants has not been defined. | 634 | ENSG00000079385 | CEACAM1 |
| dermcidin | This antimicrobial gene encodes a secreted protein that is subsequently processed into mature peptides of distinct biological activities. The C-terminal peptide is constitutively expressed in sweat and has antibacterial and antifungal activities. The N-terminal peptide, also known as diffusible survival evasion peptide, promotes neural cell survival under conditions of severe oxidative stress. A glycosylated form of the N-terminal peptide may be associated with cachexia (muscle wasting) in cancer patients. Alternative splicing results in multiple transcript variants encoding different isoforms. | 117159 | ENSG00000161634 | DCD |
| NAD(P)H quinone dehydrogenase 1 | This gene is a member of the NAD(P)H dehydrogenase (quinone) family and encodes a cytoplasmic 2-electron reductase. This FAD-binding protein forms homodimers and reduces quinones to hydroquinones. This protein’s enzymatic activity prevents the one electron reduction of quinones that results in the production of radical species. Mutations in this gene have been associated with tardive dyskinesia (TD), an increased risk of hematotoxicity after exposure to benzene, and susceptibility to various forms of cancer. Altered expression of this protein has been seen in many tumors and is also associated with Alzheimer’s disease (AD). Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 1728 | ENSG00000181019 | NQO1 |
| inter-alpha-trypsin inhibitor heavy chain 3 | This gene encodes the heavy chain subunit of the pre-alpha-trypsin inhibitor complex. This complex may stabilize the extracellular matrix through its ability to bind hyaluronic acid. Polymorphisms of this gene may be associated with increased risk for schizophrenia and major depressive disorder. This gene is present in an inter-alpha-trypsin inhibitor family gene cluster on chromosome 3. | 3699 | ENSG00000162267 | ITIH3 |
| albumin | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | 213 | ENSG00000163631 | ALB |
| thioredoxin | The protein encoded by this gene acts as a homodimer and is involved in many redox reactions. The encoded protein is active in the reversible S-nitrosylation of cysteines in certain proteins, which is part of the response to intracellular nitric oxide. This protein is found in the cytoplasm. Two transcript variants encoding different isoforms have been found for this gene. | 7295 | ENSG00000136810 | TXN |
| keratin 16 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region of chromosome 17q12-q21. This keratin has been coexpressed with keratin 14 in a number of epithelial tissues, including esophagus, tongue, and hair follicles. Mutations in this gene are associated with type 1 pachyonychia congenita, non-epidermolytic palmoplantar keratoderma and unilateral palmoplantar verrucous nevus. | 3868 | ENSG00000186832 | KRT16 |
| RAB25, member RAS oncogene family | The protein encoded by this gene is a member of the RAS superfamily of small GTPases. The encoded protein is involved in membrane trafficking and cell survival. This gene has been found to be a tumor suppressor and an oncogene, depending on the context. Two variants, one protein-coding and the other not, have been found for this gene. | 57111 | ENSG00000132698 | RAB25 |
| vimentin | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | ENSG00000026025 | VIM |
| keratinocyte differentiation associated protein | This gene encodes a protein which may function in the regulation of keratinocyte differentiation and maintenance of stratified epithelia. Multiple transcript variants encoding different isoforms have been found for this gene. | 388533 | ENSG00000188508 | KRTDAP |
| calpain 2 | The calpains, calcium-activated neutral proteases, are nonlysosomal, intracellular cysteine proteases. The mammalian calpains include ubiquitous, stomach-specific, and muscle-specific proteins. The ubiquitous enzymes consist of heterodimers with distinct large, catalytic subunits associated with a common small, regulatory subunit. This gene encodes the large subunit of the ubiquitous enzyme, calpain 2. Multiple heterogeneous transcriptional start sites in the 5’ UTR have been reported. Two transcript variants encoding different isoforms have been found for this gene. | 824 | ENSG00000162909 | CAPN2 |
| calpain 1 | The calpains, calcium-activated neutral proteases, are nonlysosomal, intracellular cysteine proteases. The mammalian calpains include ubiquitous, stomach-specific, and muscle-specific proteins. The ubiquitous enzymes consist of heterodimers with distinct large, catalytic subunits associated with a common small, regulatory subunit. This gene encodes the large subunit of the ubiquitous enzyme, calpain 1. Several transcript variants encoding two different isoforms have been found for this gene. | 823 | ENSG00000014216 | CAPN1 |
| tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein zeta | This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 99% identical to the mouse, rat and sheep orthologs. The encoded protein interacts with IRS1 protein, suggesting a role in regulating insulin sensitivity. Several transcript variants that differ in the 5’ UTR but that encode the same protein have been identified for this gene. | 7534 | ENSG00000164924 | YWHAZ |
| cytochrome P450 family 4 subfamily F member 29, pseudogene | NA | 54055 | ENSG00000228314 | CYP4F29P |
| prostate stem cell antigen | This gene encodes a glycosylphosphatidylinositol-anchored cell membrane glycoprotein. In addition to being highly expressed in the prostate it is also expressed in the bladder, placenta, colon, kidney, and stomach. This gene is up-regulated in a large proportion of prostate cancers and is also detected in cancers of the bladder and pancreas. This gene includes a polymorphism that results in an upstream start codon in some individuals; this polymorphism is thought to be associated with a risk for certain gastric and bladder cancers. Alternative splicing results in multiple transcript variants. | 8000 | ENSG00000167653 | PSCA |
| tumor protein p53 inducible protein 3 | The protein encoded by this gene is similar to oxidoreductases, which are enzymes involved in cellular responses to oxidative stresses and irradiation. This gene is induced by the tumor suppressor p53 and is thought to be involved in p53-mediated cell death. It contains a p53 consensus binding site in its promoter region and a downstream pentanucleotide microsatellite sequence. P53 has been shown to transcriptionally activate this gene by interacting with the downstream pentanucleotide microsatellite sequence. The microsatellite is polymorphic, with a varying number of pentanucleotide repeats directly correlated with the extent of transcriptional activation by p53. It has been suggested that the microsatellite polymorphism may be associated with differential susceptibility to cancer. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 9540 | ENSG00000115129 | TP53I3 |
| claudin 1 | Tight junctions represent one mode of cell-to-cell adhesion in epithelial or endothelial cell sheets, forming continuous seals around cells and serving as a physical barrier to prevent solutes and water from passing freely through the paracellular space. These junctions are comprised of sets of continuous networking strands in the outwardly facing cytoplasmic leaflet, with complementary grooves in the inwardly facing extracytoplasmic leaflet. The protein encoded by this gene, a member of the claudin family, is an integral membrane protein and a component of tight junction strands. Loss of function mutations result in neonatal ichthyosis-sclerosing cholangitis syndrome. | 9076 | ENSG00000163347 | CLDN1 |
| claudin 7 | This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. Differential expression of this gene has been observed in different types of malignancies, including breast cancer, ovarian cancer, hepatocellular carcinomas, urinary tumors, prostate cancer, lung cancer, head and neck cancers, thyroid carcinomas, etc. Alternatively spliced transcript variants encoding different isoforms have been found. | 1366 | ENSG00000181885 | CLDN7 |
| annexin A3 | This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions in the inhibition of phospholipase A2 and cleavage of inositol 1,2-cyclic phosphate to form inositol 1-phosphate. This protein may also play a role in anti-coagulation. | 306 | ENSG00000138772 | ANXA3 |
| sterile alpha motif domain containing 9 | This gene encodes a sterile alpha motif domain-containing protein. The encoded protein localizes to the cytoplasm and may play a role in regulating cell proliferation and apoptosis. Mutations in this gene are the cause of normophosphatemic familial tumoral calcinosis. Alternate splicing results in multiple transcript variants that encode the same protein. | 54809 | ENSG00000205413 | SAMD9 |
| Kruppel like factor 5 | This gene encodes a member of the Kruppel-like factor subfamily of zinc finger proteins. The encoded protein is a transcriptional activator that binds directly to a specific recognition motif in the promoters of target genes. This protein acts downstream of multiple different signaling pathways and is regulated by post-translational modification. It may participate in both promoting and suppressing cell proliferation. Expression of this gene may be changed in a variety of different cancers and in cardiovascular disease. Alternative splicing results in multiple transcript variants. | 688 | ENSG00000102554 | KLF5 |
| V-set and immunoglobulin domain containing 10 like | NA | 147645 | ENSG00000186806 | VSIG10L |
# Save the Ensembl IDs returned for cluster 7, one ID per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 7, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
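The `write.table` call above is repeated for each cluster with only the cluster number changing. A minimal sketch of how that step could be wrapped in a helper is shown below. The function name `write_cluster_genes` and its arguments are hypothetical and not part of the original analysis. It uses `writeLines` instead of `write.table`, which should produce the same one-ID-per-line file for a character vector of queries.

```r
# Hypothetical helper (not part of the original analysis): writes one
# Ensembl gene ID per line to gene_names_clus_<cluster>.txt in out_dir.
write_cluster_genes <- function(ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  # as.character() mirrors writing a factor with quote = FALSE
  writeLines(as.character(ids), path)
  invisible(path)
}

# Usage with a temporary directory and two IDs taken from the table above
demo_dir <- tempfile("gene_names_")
dir.create(demo_dir)
demo_path <- write_cluster_genes(c("ENSG00000171346", "ENSG00000124107"), 7, demo_dir)
readLines(demo_path)
```

In the analysis itself, `ids` would be `out$query` and `out_dir` would be the `../utilities/GTEX2013_sparse_fac_sqrt` directory used above.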
out <- mygene::queryMany(gene_list[8,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| X_id | summary | name | symbol | query |
|---|---|---|---|---|
| 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | actin, gamma 2, smooth muscle, enteric | ACTG2 | ENSG00000163017 |
| 1832 | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | desmoplakin | DSP | ENSG00000096696 |
| 64065 | NA | PERP, TP53 apoptosis effector | PERP | ENSG00000112378 |
| 3855 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the simple epithelia lining the cavities of the internal organs and in the gland ducts and blood vessels. The genes encoding the type II cytokeratins are clustered in a region of chromosome 12q12-q13. Alternative splicing may result in several transcript variants; however, not all variants have been fully described. | keratin 7 | KRT7 | ENSG00000135480 |
| 4625 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | myosin, heavy chain 7, cardiac muscle, beta | MYH7 | ENSG00000092054 |
| 84525 | The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. | HOP homeobox | HOPX | ENSG00000171476 |
| 125 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | alcohol dehydrogenase 1B (class I), beta polypeptide | ADH1B | ENSG00000196616 |
| 1465 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | cysteine and glycine rich protein 1 | CSRP1 | ENSG00000159176 |
| 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | actin, alpha 2, smooth muscle, aorta | ACTA2 | ENSG00000107796 |
| 23650 | The protein encoded by this gene belongs to the TRIM protein family. It has multiple zinc finger motifs and a leucine zipper motif. It has been proposed to form homo- or heterodimers which are involved in nucleic acid binding. Thus, it may act as a transcriptional regulatory factor involved in carcinogenesis and/or differentiation. It may also function in the suppression of radiosensitivity since it is associated with ataxia telangiectasia phenotype. | tripartite motif containing 29 | TRIM29 | ENSG00000137699 |
| 11187 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may act in cellular desmosome-dependent adhesion and signaling pathways. Two transcript variants encoding different isoforms have been found for this gene. | plakophilin 3 | PKP3 | ENSG00000184363 |
| 3860 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | keratin 13 | KRT13 | ENSG00000171401 |
| 53905 | The protein encoded by this gene is a glycoprotein and a member of the NADPH oxidase family. The synthesis of thyroid hormone is catalyzed by a protein complex located at the apical membrane of thyroid follicular cells. This complex contains an iodide transporter, thyroperoxidase, and a peroxide generating system that includes proteins encoded by this gene and the similar DUOX2 gene. This protein is known as dual oxidase because it has both a peroxidase homology domain and a gp91phox domain. This protein generates hydrogen peroxide and thereby plays a role in the activity of thyroid peroxidase, lactoperoxidase, and in lactoperoxidase-mediated antimicrobial defense at mucosal surfaces. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. | dual oxidase 1 | DUOX1 | ENSG00000137857 |
| 960 | The protein encoded by this gene is a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. It is a receptor for hyaluronic acid (HA) and can also interact with other ligands, such as osteopontin, collagens, and matrix metalloproteinases (MMPs). This protein participates in a wide variety of cellular functions including lymphocyte activation, recirculation and homing, hematopoiesis, and tumor metastasis. Transcripts for this gene undergo complex alternative splicing that results in many functionally distinct isoforms, however, the full length nature of some of these variants has not been determined. Alternative splicing is the basis for the structural and functional diversity of this protein, and may be related to tumor metastasis. | CD44 molecule (Indian blood group) | CD44 | ENSG00000026508 |
| 54869 | This gene encodes a protein that is related to epidermal growth factor receptor pathway substrate 8 (EPS8), a substrate for the epidermal growth factor receptor. The function of this protein is unknown. At least two alternatively spliced transcript variants encoding different isoforms have been found for this gene. | EPS8 like 1 | EPS8L1 | ENSG00000131037 |
| 6288 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | serum amyloid A1 | SAA1 | ENSG00000173432 |
| 5317 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may be involved in molecular recruitment and stabilization during desmosome formation. Mutations in this gene have been associated with the ectodermal dysplasia/skin fragility syndrome. Two transcript variants encoding different isoforms have been found for this gene. | plakophilin 1 | PKP1 | ENSG00000081277 |
| 2752 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | glutamate-ammonia ligase | GLUL | ENSG00000135821 |
| 9289 | This gene encodes a member of the G protein-coupled receptor family and regulates brain cortical patterning. The encoded protein binds specifically to transglutaminase 2, a component of tissue and tumor stroma implicated as an inhibitor of tumor progression. Mutations in this gene are associated with a brain malformation known as bilateral frontoparietal polymicrogyria. Alternative splicing results in multiple transcript variants. | adhesion G protein-coupled receptor G1 | ADGRG1 | ENSG00000205336 |
| 6319 | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | stearoyl-CoA desaturase | SCD | ENSG00000099194 |
| 25946 | Zinc finger proteins, such as ZNF385A, are regulatory proteins that act as transcription factors, bind single- or double-stranded RNA, or interact with other proteins (Sharma et al., 2004 [PubMed 15527981]). | zinc finger protein 385A | ZNF385A | ENSG00000161642 |
| 5265 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | serpin family A member 1 | SERPINA1 | ENSG00000197249 |
| 374897 | NA | suprabasin | SBSN | ENSG00000189001 |
| 93099 | This gene is upregulated in inflammatory diseases, and it was first observed as expressed in the differentiated layers of skin. The most interesting aspect of this gene is the differential use of promoters and terminators to generate isoforms with unique cellular distributions and domain components. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | dermokine | DMKN | ENSG00000161249 |
| 57111 | The protein encoded by this gene is a member of the RAS superfamily of small GTPases. The encoded protein is involved in membrane trafficking and cell survival. This gene has been found to be a tumor suppressor and an oncogene, depending on the context. Two variants, one protein-coding and the other not, have been found for this gene. | RAB25, member RAS oncogene family | RAB25 | ENSG00000132698 |
| 7038 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | thyroglobulin | TG | ENSG00000042832 |
| 171024 | NA | synaptopodin 2 | SYNPO2 | ENSG00000172403 |
| 2261 | This gene encodes a member of the fibroblast growth factor receptor (FGFR) family, with its amino acid sequence being highly conserved between members and among divergent species. FGFR family members differ from one another in their ligand affinities and tissue distribution. A full-length representative protein would consist of an extracellular region, composed of three immunoglobulin-like domains, a single hydrophobic membrane-spanning segment and a cytoplasmic tyrosine kinase domain. The extracellular portion of the protein interacts with fibroblast growth factors, setting in motion a cascade of downstream signals, ultimately influencing mitogenesis and differentiation. This particular family member binds acidic and basic fibroblast growth hormone and plays a role in bone development and maintenance. Mutations in this gene lead to craniosynostosis and multiple types of skeletal dysplasia. Three alternatively spliced transcript variants that encode different protein isoforms have been described. | fibroblast growth factor receptor 3 | FGFR3 | ENSG00000068078 |
| 5617 | This gene encodes the anterior pituitary hormone prolactin. This secreted hormone is a growth regulator for many tissues, including cells of the immune system. It may also play a role in cell survival by suppressing apoptosis, and it is essential for lactation. Alternative splicing results in multiple transcript variants that encode the same protein. | prolactin | PRL | ENSG00000172179 |
| 1158 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | creatine kinase, M-type | CKM | ENSG00000104879 |
| 1843 | The expression of DUSP1 gene is induced in human skin fibroblasts by oxidative/heat stress and growth factors. It specifies a protein with structural features similar to members of the non-receptor-type protein-tyrosine phosphatase family, and which has significant amino-acid sequence similarity to a Tyr/Ser-protein phosphatase encoded by the late gene H1 of vaccinia virus. The bacterially expressed and purified DUSP1 protein has intrinsic phosphatase activity, and specifically inactivates mitogen-activated protein (MAP) kinase in vitro by the concomitant dephosphorylation of both its phosphothreonine and phosphotyrosine residues. Furthermore, it suppresses the activation of MAP kinase by oncogenic ras in extracts of Xenopus oocytes. Thus, DUSP1 may play an important role in the human cellular response to environmental stress as well as in the negative regulation of cellular proliferation. | dual specificity phosphatase 1 | DUSP1 | ENSG00000120129 |
| 4070 | This intronless gene encodes a carcinoma-associated antigen. This antigen is a cell surface receptor that transduces calcium signals. Mutations of this gene have been associated with gelatinous drop-like corneal dystrophy. | tumor-associated calcium signal transducer 2 | TACSTD2 | ENSG00000184292 |
| 2597 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | glyceraldehyde-3-phosphate dehydrogenase | GAPDH | ENSG00000111640 |
| 213 | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | albumin | ALB | ENSG00000163631 |
| 3309 | The protein encoded by this gene is a member of the heat shock protein 70 (HSP70) family. It is localized in the lumen of the endoplasmic reticulum (ER), and is involved in the folding and assembly of proteins in the ER. As this protein interacts with many ER proteins, it may play a key role in monitoring protein transport through the cell. | heat shock protein family A (Hsp70) member 5 | HSPA5 | ENSG00000044574 |
| 1401 | The protein encoded by this gene belongs to the pentaxin family. It is involved in several host defense related functions based on its ability to recognize foreign pathogens and damaged cells of the host and to initiate their elimination by interacting with humoral and cellular effector systems in the blood. Consequently, the level of this protein in plasma increases greatly during acute phase response to tissue injury, infection, or other inflammatory stimuli. | C-reactive protein, pentraxin-related | CRP | ENSG00000132693 |
| 7169 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | tropomyosin 2 (beta) | TPM2 | ENSG00000198467 |
| 2243 | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | fibrinogen alpha chain | FGA | ENSG00000171560 |
| 3880 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | keratin 19 | KRT19 | ENSG00000171345 |
| 7094 | This gene encodes a cytoskeletal protein that is concentrated in areas of cell-substratum and cell-cell contacts. The encoded protein plays a significant role in the assembly of actin filaments and in spreading and migration of various cell types, including fibroblasts and osteoclasts. It codistributes with integrins in the cell surface membrane in order to assist in the attachment of adherent cells to extracellular matrices and of lymphocytes to other cells. The N-terminus of this protein contains elements for localization to cell-extracellular matrix junctions. The C-terminus contains binding sites for proteins such as beta-1-integrin, actin, and vinculin. | talin 1 | TLN1 | ENSG00000137076 |
| 3848 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 1 | KRT1 | ENSG00000167768 |
| 2697 | This gene is a member of the connexin gene family. The encoded protein is a component of gap junctions, which are composed of arrays of intercellular channels that provide a route for the diffusion of low molecular weight materials from cell to cell. The encoded protein is the major protein of gap junctions in the heart that are thought to have a crucial role in the synchronized contraction of the heart and in embryonic development. A related intronless pseudogene has been mapped to chromosome 5. Mutations in this gene have been associated with oculodentodigital dysplasia, autosomal recessive craniometaphyseal dysplasia and heart malformations. | gap junction protein alpha 1 | GJA1 | ENSG00000152661 |
| 83959 | This gene encodes a voltage-regulated, electrogenic sodium-coupled borate cotransporter that is essential for borate homeostasis, cell growth and cell proliferation. Mutations in this gene have been associated with a number of endothelial corneal dystrophies including recessive corneal endothelial dystrophy 2, corneal dystrophy and perceptive deafness, and Fuchs endothelial corneal dystrophy. Multiple transcript variants encoding different isoforms have been described. | solute carrier family 4 member 11 | SLC4A11 | ENSG00000088836 |
| 3960 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | galectin 4 | LGALS4 | ENSG00000171747 |
| 4359 | This gene is specifically expressed in Schwann cells of the peripheral nervous system and encodes a type I transmembrane glycoprotein that is a major structural protein of the peripheral myelin sheath. The encoded protein contains a large hydrophobic extracellular domain and a smaller basic intracellular domain, which are essential for the formation and stabilization of the multilamellar structure of the compact myelin. Mutations in this gene are associated with autosomal dominant form of Charcot-Marie-Tooth disease type 1 (CMT1B) and other polyneuropathies, such as Dejerine-Sottas syndrome (DSS) and congenital hypomyelinating neuropathy (CHN). A recent study showed that two isoforms are produced from the same mRNA by use of alternative in-frame translation termination codons via a stop codon readthrough mechanism. | myelin protein zero | MPZ | ENSG00000158887 |
| 476 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 1 subunit. Multiple transcript variants encoding different isoforms have been found for this gene. | ATPase Na+/K+ transporting subunit alpha 1 | ATP1A1 | ENSG00000163399 |
| 7056 | The protein encoded by this intronless gene is an endothelial-specific type I membrane receptor that binds thrombin. This binding results in the activation of protein C, which degrades clotting factors Va and VIIIa and reduces the amount of thrombin generated. Mutations in this gene are a cause of thromboembolic disease, also known as inherited thrombophilia. | thrombomodulin | THBD | ENSG00000178726 |
| 70 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | actin, alpha, cardiac muscle 1 | ACTC1 | ENSG00000159251 |
| 388533 | This gene encodes a protein which may function in the regulation of keratinocyte differentiation and maintenance of stratified epithelia. Multiple transcript variants encoding different isoforms have been found for this gene. | keratinocyte differentiation associated protein | KRTDAP | ENSG00000188508 |
| 10653 | This gene encodes a transmembrane protein with two extracellular Kunitz domains that inhibits a variety of serine proteases. The protein inhibits HGF activator which prevents the formation of active hepatocyte growth factor. This gene is a putative tumor suppressor, and mutations in this gene result in congenital sodium diarrhea. Multiple transcript variants encoding different isoforms have been found for this gene. | serine peptidase inhibitor, Kunitz type, 2 | SPINT2 | ENSG00000167642 |
| 2335 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | fibronectin 1 | FN1 | ENSG00000115414 |
| 6285 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21; however, this gene is located at 21q22.3. This protein may function in neurite extension, proliferation of melanoma cells, stimulation of Ca2+ fluxes, inhibition of PKC-mediated phosphorylation, astrocytosis and axonal proliferation, and inhibition of microtubule assembly. Chromosomal rearrangements and altered expression of this gene have been implicated in several neurological, neoplastic, and other types of diseases, including Alzheimer’s disease, Down’s syndrome, epilepsy, amyotrophic lateral sclerosis, melanoma, and type I diabetes. | S100 calcium binding protein B | S100B | ENSG00000160307 |
| 8428 | This gene encodes a serine/threonine protein kinase that functions upstream of mitogen-activated protein kinase (MAPK) signaling. The encoded protein is cleaved into two chains by caspases; the N-terminal fragment (MST3/N) translocates to the nucleus and promotes programmed cell death. There is a pseudogene for this gene on chromosome X. Alternative splicing results in multiple transcript variants. | serine/threonine kinase 24 | STK24 | ENSG00000102572 |
| ENSG00000225630 | NA | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | MTND2P28 | ENSG00000225630 |
| 9314 | This gene encodes a protein that belongs to the Kruppel family of transcription factors. The encoded zinc finger protein is required for normal development of the barrier function of skin. The encoded protein is thought to control the G1-to-S transition of the cell cycle following DNA damage by mediating the tumor suppressor gene p53. Mice lacking this gene have a normal appearance but lose weight rapidly, and die shortly after birth due to fluid evaporation resulting from compromised epidermal barrier function. Alternative splicing results in multiple transcript variants encoding different isoforms. | Kruppel like factor 4 | KLF4 | ENSG00000136826 |
| 2688 | The protein encoded by this gene is a member of the somatotropin/prolactin family of hormones which play an important role in growth control. The gene, along with four other related genes, is located at the growth hormone locus on chromosome 17 where they are interspersed in the same transcriptional orientation; an arrangement which is thought to have evolved by a series of gene duplications. The five genes share a remarkably high degree of sequence identity. Alternative splicing generates additional isoforms of each of the five growth hormones, leading to further diversity and potential for specialization. This particular family member is expressed in the pituitary but not in placental tissue as is the case for the other four genes in the growth hormone locus. Mutations in or deletions of the gene lead to growth hormone deficiency and short stature. | growth hormone 1 | GH1 | ENSG00000259384 |
| 360 | This gene encodes the water channel protein aquaporin 3. Aquaporins are a family of small integral membrane proteins related to the major intrinsic protein, also known as aquaporin 0. Aquaporin 3 is localized at the basal lateral membranes of collecting duct cells in the kidney. In addition to its water channel function, aquaporin 3 has been found to facilitate the transport of nonionic small solutes such as urea and glycerol, but to a smaller degree. It has been suggested that water channels can be functionally heterogeneous and possess water and solute permeation mechanisms. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | aquaporin 3 (Gill blood group) | AQP3 | ENSG00000165272 |
| 54739 | This gene encodes a protein which binds to and counteracts the inhibitory effect of a member of the IAP (inhibitor of apoptosis) protein family. IAP proteins bind to and inhibit caspases which are activated during apoptosis. The proportion of IAPs and proteins which interfere with their activity, such as the encoded protein, affect the progress of the apoptosis signaling pathway. Multiple transcript variants encoding different isoforms have been found for this gene. | XIAP associated factor 1 | XAF1 | ENSG00000132530 |
| 1191 | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | clusterin | CLU | ENSG00000120885 |
| 10135 | This gene encodes a protein that catalyzes the condensation of nicotinamide with 5-phosphoribosyl-1-pyrophosphate to yield nicotinamide mononucleotide, one step in the biosynthesis of nicotinamide adenine dinucleotide. The protein belongs to the nicotinic acid phosphoribosyltransferase (NAPRTase) family and is thought to be involved in many important biological processes, including metabolism, stress response and aging. This gene has a pseudogene on chromosome 10. | nicotinamide phosphoribosyltransferase | NAMPT | ENSG00000105835 |
| 220323 | NA | out at first homolog | OAF | ENSG00000184232 |
| 50649 | Rho GTPases play a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors. The protein encoded by this gene may form a complex with G proteins and stimulate Rho-dependent signals. Multiple alternatively spliced transcript variants encoding different isoforms have been found, but the full-length nature of some variants has not been determined. | Rho guanine nucleotide exchange factor 4 | ARHGEF4 | ENSG00000136002 |
| 2266 | The protein encoded by this gene is the gamma component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia and thrombophilia. Alternative splicing results in transcript variants encoding different isoforms. | fibrinogen gamma chain | FGG | ENSG00000171557 |
| 5919 | This gene encodes a secreted chemotactic protein that initiates chemotaxis via the ChemR23 G protein-coupled seven-transmembrane domain ligand. Expression of this gene is upregulated by the synthetic retinoid tazarotene and occurs in a wide variety of tissues. The active protein has several roles, including that as an adipokine and as an antimicrobial protein with activity against bacteria and fungi. | retinoic acid receptor responder 2 | RARRES2 | ENSG00000106538 |
| 7534 | This gene product belongs to the 14-3-3 family of proteins which mediate signal transduction by binding to phosphoserine-containing proteins. This highly conserved protein family is found in both plants and mammals, and this protein is 99% identical to the mouse, rat and sheep orthologs. The encoded protein interacts with IRS1 protein, suggesting a role in regulating insulin sensitivity. Several transcript variants that differ in the 5’ UTR but that encode the same protein have been identified for this gene. | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein zeta | YWHAZ | ENSG00000164924 |
| 1292 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | collagen type VI alpha 2 | COL6A2 | ENSG00000142173 |
| 134147 | CMBL (EC 3.1.1.45) is a cysteine hydrolase of the dienelactone hydrolase family that is highly expressed in liver cytosol. CMBL preferentially cleaves cyclic esters, and it activates medoxomil-ester prodrugs in which the medoxomil moiety is linked to an oxygen atom (Ishizuka et al., 2010 [PubMed 20177059]). | carboxymethylenebutenolidase homolog (Pseudomonas) | CMBL | ENSG00000164237 |
| 5004 | This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | orosomucoid 1 | ORM1 | ENSG00000229314 |
| 57402 | This gene encodes a member of the S100 protein family which contains an EF-hand motif and binds calcium. The gene is located in a cluster of S100 genes on chromosome 1. Levels of the encoded protein have been found to be lower in cancerous tissue and associated with metastasis suggesting a tumor suppressor function (PMID: 19956863, 19351828). | S100 calcium binding protein A14 | S100A14 | ENSG00000189334 |
| 488 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 | ATP2A2 | ENSG00000174437 |
| ENSG00000180139 | NA | ACTA2 antisense RNA 1 | ACTA2-AS1 | ENSG00000180139 |
| 3512 | NA | joining chain of multimeric IgA and IgM | JCHAIN | ENSG00000132465 |
| 9620 | The protein encoded by this gene is a member of the flamingo subfamily, part of the cadherin superfamily. The flamingo subfamily consists of nonclassic-type cadherins; a subpopulation that does not interact with catenins. The flamingo cadherins are located at the plasma membrane and have nine cadherin domains, seven epidermal growth factor-like repeats and two laminin A G-type repeats in their ectodomain. They also have seven transmembrane domains, a characteristic unique to this subfamily. It is postulated that these proteins are receptors involved in contact-mediated communication, with cadherin domains acting as homophilic binding regions and the EGF-like domains involved in cell adhesion and receptor-ligand interactions. This particular member is a developmentally regulated, neural-specific gene which plays an unspecified role in early embryogenesis. | cadherin EGF LAG seven-pass G-type receptor 1 | CELSR1 | ENSG00000075275 |
| 10057 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MRP subfamily which is involved in multi-drug resistance. This protein functions in the cellular export of its substrate, cyclic nucleotides. This export contributes to the degradation of phosphodiesterases and possibly an elimination pathway for cyclic nucleotides. Studies show that this protein provides resistance to thiopurine anticancer drugs, 6-mercaptopurine and thioguanine, and the anti-HIV drug 9-(2-phosphonylmethoxyethyl)adenine. This protein may be involved in resistance to thiopurines in acute lymphoblastic leukemia and antiretroviral nucleoside analogs in HIV-infected patients. Alternative splicing results in multiple transcript variants. | ATP binding cassette subfamily C member 5 | ABCC5 | ENSG00000114770 |
| 5950 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | retinol binding protein 4 | RBP4 | ENSG00000138207 |
| 149428 | The protein encoded by this gene interacts with several other proteins, such as BCL2, ARHGAP1, MIF and GFER. It may function as a bridge molecule between BCL2 and ARHGAP1/CDC42 in promoting cell death. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | BCL2/adenovirus E1B 19kD interacting protein like | BNIPL | ENSG00000163141 |
| 2244 | The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | fibrinogen beta chain | FGB | ENSG00000171564 |
| 1360 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | carboxypeptidase B1 | CPB1 | ENSG00000153002 |
| 25900 | This gene is a member of the intermediate filament family. Intermediate filaments are proteins which are primordial components of the cytoskeleton and nuclear envelope. The proteins encoded by the members of this gene family are evolutionarily and structurally related but have limited sequence homology, with the exception of the central rod domain. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | intermediate filament family orphan 1 | IFFO1 | ENSG00000010295 |
| 23344 | NA | extended synaptotagmin protein 1 | ESYT1 | ENSG00000139641 |
| 58498 | NA | myosin light chain 7 | MYL7 | ENSG00000106631 |
| 29842 | NA | transcription factor CP2-like 1 | TFCP2L1 | ENSG00000115112 |
| ENSG00000261054 | NA | NA | RP11-6O2.4 | ENSG00000261054 |
| 2052 | Epoxide hydrolase is a critical biotransformation enzyme that converts epoxides from the degradation of aromatic compounds to trans-dihydrodiols which can be conjugated and excreted from the body. Epoxide hydrolase functions in both the activation and detoxification of epoxides. Mutations in this gene cause preeclampsia, epoxide hydrolase deficiency or increased epoxide hydrolase activity. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | epoxide hydrolase 1 | EPHX1 | ENSG00000143819 |
| 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | MYH11 | ENSG00000133392 |
| ENSG00000229732 | NA | NA | AC019349.5 | ENSG00000229732 |
| 8766 | The protein encoded by this gene belongs to the Rab family of the small GTPase superfamily. It is associated with both constitutive and regulated secretory pathways, and may be involved in protein transport. Two transcript variants encoding different isoforms have been found for this gene. | RAB11A, member RAS oncogene family | RAB11A | ENSG00000103769 |
| 84649 | This gene encodes one of two enzymes which catalyzes the final reaction in the synthesis of triglycerides in which diacylglycerol is covalently bound to long chain fatty acyl-CoAs. The encoded protein catalyzes this reaction at low concentrations of magnesium chloride while the other enzyme has high activity at high concentrations of magnesium chloride. Multiple transcript variants encoding different isoforms have been found for this gene. | diacylglycerol O-acyltransferase 2 | DGAT2 | ENSG00000062282 |
| 229 | Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | aldolase, fructose-bisphosphate B | ALDOB | ENSG00000136872 |
| 7448 | The protein encoded by this gene is a member of the pexin family. It is found in serum and tissues and promotes cell adhesion and spreading, inhibits the membrane-damaging effect of the terminal cytolytic complement pathway, and binds to several serpin serine protease inhibitors. It is a secreted protein and exists in either a single chain form or a clipped, two chain form held together by a disulfide bond. | vitronectin | VTN | ENSG00000109072 |
| 100528017 | This locus represents naturally occurring read-through transcription between the neighboring serum amyloid A2 and serum amyloid A4 genes on chromosome 11. The read-through transcript produces a fusion protein that shares sequence identity with each individual gene product. | SAA2-SAA4 readthrough | SAA2-SAA4 | ENSG00000255071 |
| 5208 | The protein encoded by this gene is involved in both the synthesis and degradation of fructose-2,6-bisphosphate, a regulatory molecule that controls glycolysis in eukaryotes. The encoded protein has a 6-phosphofructo-2-kinase activity that catalyzes the synthesis of fructose-2,6-bisphosphate, and a fructose-2,6-biphosphatase activity that catalyzes the degradation of fructose-2,6-bisphosphate. This protein regulates fructose-2,6-bisphosphate levels in the heart, while a related enzyme encoded by a different gene regulates fructose-2,6-bisphosphate levels in the liver and muscle. This enzyme functions as a homodimer. Two transcript variants encoding two different isoforms have been found for this gene. | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 2 | PFKFB2 | ENSG00000123836 |
| 1264 | NA | calponin 1 | CNN1 | ENSG00000130176 |
| 1281 | This gene encodes the pro-alpha1 chains of type III collagen, a fibrillar collagen that is found in extensible connective tissues such as skin, lung, uterus, intestine and the vascular system, frequently in association with type I collagen. Mutations in this gene are associated with Ehlers-Danlos syndrome type IV, and with aortic and arterial aneurysms. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type III alpha 1 chain | COL3A1 | ENSG00000168542 |
| 23022 | This gene encodes a cytoskeletal protein that is required for organizing the actin cytoskeleton. The protein is a component of actin-containing microfilaments, and it is involved in the control of cell shape, adhesion, and contraction. Polymorphisms in this gene are associated with a susceptibility to pancreatic cancer type 1, and also with a risk for myocardial infarction. Alternative splicing results in multiple transcript variants. | palladin, cytoskeletal associated protein | PALLD | ENSG00000129116 |
| 7538 | NA | ZFP36 ring finger protein | ZFP36 | ENSG00000128016 |
| 3849 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 2 | KRT2 | ENSG00000172867 |
| 5187 | This gene is a member of the Period family of genes and is expressed in a circadian pattern in the suprachiasmatic nucleus, the primary circadian pacemaker in the mammalian brain. Genes in this family encode components of the circadian rhythms of locomotor activity, metabolism, and behavior. This gene is upregulated by CLOCK/ARNTL heterodimers but then represses this upregulation in a feedback loop using PER/CRY heterodimers to interact with CLOCK/ARNTL. Polymorphisms in this gene may increase the risk of getting certain cancers. Alternative splicing has been observed in this gene; however, these variants have not been fully described. | period circadian clock 1 | PER1 | ENSG00000179094 |
| 7173 | This gene encodes a membrane-bound glycoprotein. The encoded protein acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones, thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter, and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene, but the full-length nature of some variants has not been determined. | thyroid peroxidase | TPO | ENSG00000115705 |
| 4311 | This gene encodes a common acute lymphocytic leukemia antigen that is an important cell surface marker in the diagnosis of human acute lymphocytic leukemia (ALL). This protein is present on leukemic cells of pre-B phenotype, which represent 85% of cases of ALL. This protein is not restricted to leukemic cells, however, and is found on a variety of normal tissues. It is a glycoprotein that is particularly abundant in kidney, where it is present on the brush border of proximal tubules and on glomerular epithelium. The protein is a neutral endopeptidase that cleaves peptides at the amino side of hydrophobic residues and inactivates several peptide hormones including glucagon, enkephalins, substance P, neurotensin, oxytocin, and bradykinin. This gene, which encodes a 100-kD type II transmembrane glycoprotein, exists in a single copy of greater than 45 kb. The 5’ untranslated region of this gene is alternatively spliced, resulting in four separate mRNA transcripts. The coding region is not affected by alternative splicing. | membrane metallo-endopeptidase | MME | ENSG00000196549 |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 8, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
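The tables returned by `mygene::queryMany` do not always have the same shape: the column order can differ between calls, and a `notfound` column appears when an Ensembl ID has no match. One way to handle this before writing the IDs out is sketched below. The helper name `tidy_annotations`, its `cols` argument, and the toy data frame are hypothetical and not part of the original analysis.

```r
# Hypothetical helper (an assumption, not the original workflow): put the
# annotation columns in a fixed order and drop queries that had no match.
tidy_annotations <- function(out, cols = c("query", "symbol", "name", "summary")) {
  df <- as.data.frame(out)
  # The notfound column is only present when at least one query is unmatched;
  # matched rows carry NA there, unmatched rows carry TRUE.
  if ("notfound" %in% names(df)) {
    df <- df[is.na(df$notfound) | !df$notfound, , drop = FALSE]
  }
  # Keep the requested columns that actually exist, in the requested order.
  df[, intersect(cols, names(df)), drop = FALSE]
}

# Toy input with the same column layout as a query result that has a miss.
toy <- data.frame(
  X_id = c("1674", NA),
  name = c("desmin", NA),
  summary = c("muscle-specific intermediate filament", NA),
  symbol = c("DES", NA),
  query = c("ENSG00000175084", "ENSG00000259716"),
  notfound = c(NA, TRUE),
  stringsAsFactors = FALSE
)
tidy <- tidy_annotations(toy)
```

With something like this in place, `tidy$query` rather than `out$query` could be passed to `write.table`, so that unmatched IDs are not carried into the per-cluster gene lists. Whether unmatched IDs should be dropped or kept depends on what the downstream step expects.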
out <- mygene::queryMany(gene_list[9,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | name | summary | symbol | query | notfound |
|---|---|---|---|---|---|
| 1674 | desmin | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | DES | ENSG00000175084 | NA |
| 4629 | myosin, heavy chain 11, smooth muscle | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | MYH11 | ENSG00000133392 | NA |
| 2335 | fibronectin 1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | FN1 | ENSG00000115414 | NA |
| 213 | albumin | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | ALB | ENSG00000163631 | NA |
| 60 | actin, beta | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ACTB | ENSG00000075624 | NA |
| 10398 | myosin light chain 9 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | MYL9 | ENSG00000101335 | NA |
| 5265 | serpin family A member 1 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | SERPINA1 | ENSG00000197249 | NA |
| 2934 | gelsolin | The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | GSN | ENSG00000148180 | NA |
| 2243 | fibrinogen alpha chain | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | FGA | ENSG00000171560 | NA |
| 2244 | fibrinogen beta chain | The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | FGB | ENSG00000171564 | NA |
| 7431 | vimentin | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | VIM | ENSG00000026025 | NA |
| NA | NA | NA | NA | ENSG00000259716 | TRUE |
| 1465 | cysteine and glycine rich protein 1 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | CSRP1 | ENSG00000159176 | NA |
| 4151 | myoglobin | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | MB | ENSG00000198125 | NA |
| 4625 | myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | MYH7 | ENSG00000092054 | NA |
| 2266 | fibrinogen gamma chain | The protein encoded by this gene is the gamma component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia and thrombophilia. Alternative splicing results in transcript variants encoding different isoforms. | FGG | ENSG00000171557 | NA |
| 5004 | orosomucoid 1 | This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | ORM1 | ENSG00000229314 | NA |
| 4638 | myosin light chain kinase | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | MYLK | ENSG00000065534 | NA |
| 7038 | thyroglobulin | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | TG | ENSG00000042832 | NA |
| 4633 | myosin light chain 2 | Thus gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | MYL2 | ENSG00000111245 | NA |
| 2199 | fibulin 2 | This gene encodes an extracellular matrix protein, which belongs to the fibulin family. This protein binds various extracellular ligands and calcium. It may play a role during organ development, in particular, during the differentiation of heart, skeletal and neuronal structures. Alternatively spliced transcript variants encoding different isoforms have been identified. | FBLN2 | ENSG00000163520 | NA |
| 1401 | C-reactive protein, pentraxin-related | The protein encoded by this gene belongs to the pentaxin family. It is involved in several host defense related functions based on its ability to recognize foreign pathogens and damaged cells of the host and to initiate their elimination by interacting with humoral and cellular effector systems in the blood. Consequently, the level of this protein in plasma increases greatly during acute phase response to tissue injury, infection, or other inflammatory stimuli. | CRP | ENSG00000132693 | NA |
| 151887 | coiled-coil domain containing 80 | NA | CCDC80 | ENSG00000091986 | NA |
| 4023 | lipoprotein lipase | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | LPL | ENSG00000175445 | NA |
| 87 | actinin alpha 1 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | ACTN1 | ENSG00000072110 | NA |
| 302 | annexin A2 | This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions as an autocrine factor which heightens osteoclast formation and bone resorption. This gene has three pseudogenes located on chromosomes 4, 9 and 10, respectively. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ANXA2 | ENSG00000182718 | NA |
| 7145 | tensin 1 | The protein encoded by this gene localizes to focal adhesions, regions of the plasma membrane where the cell attaches to the extracellular matrix. This protein crosslinks actin filaments and contains a Src homology 2 (SH2) domain, which is often found in molecules involved in signal transduction. This protein is a substrate of calpain II. Alternative splicing results in multiple transcript variants encoding different isoforms. | TNS1 | ENSG00000079308 | NA |
| 1571 | cytochrome P450 family 2 subfamily E member 1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and is induced by ethanol, the diabetic state, and starvation. The enzyme metabolizes both endogenous substrates, such as ethanol, acetone, and acetal, as well as exogenous substrates including benzene, carbon tetrachloride, ethylene glycol, and nitrosamines which are premutagens found in cigarette smoke. Due to its many substrates, this enzyme may be involved in such varied processes as gluconeogenesis, hepatic cirrhosis, diabetes, and cancer. | CYP2E1 | ENSG00000130649 | NA |
| 27063 | ankyrin repeat domain 1 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | ANKRD1 | ENSG00000148677 | NA |
| 9445 | integral membrane protein 2B | Amyloid precursor proteins are processed by beta-secretase and gamma-secretase to produce beta-amyloid peptides which form the characteristic plaques of Alzheimer disease. This gene encodes a transmembrane protein which is processed at the C-terminus by furin or furin-like proteases to produce a small secreted peptide which inhibits the deposition of beta-amyloid. Mutations which result in extension of the C-terminal end of the encoded protein, thereby increasing the size of the secreted peptide, are associated with two neurodegenerative diseases, familial British dementia and familial Danish dementia. | ITM2B | ENSG00000136156 | NA |
| 345 | apolipoprotein C3 | Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-I and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | APOC3 | ENSG00000110245 | NA |
| 1278 | collagen type I alpha 2 chain | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | COL1A2 | ENSG00000164692 | NA |
| 567 | beta-2-microglobulin | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | B2M | ENSG00000166710 | NA |
| 4634 | myosin light chain 3 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. | MYL3 | ENSG00000160808 | NA |
| 1410 | crystallin alpha B | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | CRYAB | ENSG00000109846 | NA |
| 5730 | prostaglandin D2 synthase | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | PTGDS | ENSG00000107317 | NA |
| 350 | apolipoprotein H | Apolipoprotein H has been implicated in a variety of physiologic pathways including lipoprotein metabolism, coagulation, and the production of antiphospholipid autoantibodies. APOH may be a required cofactor for anionic phospholipid binding by the antiphospholipid autoantibodies found in sera of many patients with lupus and primary antiphospholipid syndrome, but it does not seem to be required for the reactivity of antiphospholipid autoantibodies associated with infections. | APOH | ENSG00000091583 | NA |
| 1293 | collagen type VI alpha 3 chain | This gene encodes the alpha-3 chain, one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The alpha-3 chain of type VI collagen is much larger than the alpha-1 and -2 chains. This difference in size is largely due to an increase in the number of subdomains, similar to von Willebrand Factor type A domains, that are found in the amino terminal globular domain of all the alpha chains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in the type VI collagen genes are associated with Bethlem myopathy, a rare autosomal dominant proximal myopathy with early childhood onset. Mutations in this gene are also a cause of Ullrich congenital muscular dystrophy, also referred to as Ullrich scleroatonic muscular dystrophy, an autosomal recessive congenital myopathy that is more severe than Bethlem myopathy. Multiple transcript variants have been identified, but the full-length nature of only some of these variants has been described. | COL6A3 | ENSG00000163359 | NA |
| 2012 | epithelial membrane protein 1 | NA | EMP1 | ENSG00000134531 | NA |
| 23413 | neuronal calcium sensor 1 | This gene is a member of the neuronal calcium sensor gene family, which encode calcium-binding proteins expressed predominantly in neurons. The protein encoded by this gene regulates G protein-coupled receptor phosphorylation in a calcium-dependent manner and can substitute for calmodulin. The protein is associated with secretory granules and modulates synaptic transmission and synaptic plasticity. Multiple transcript variants encoding different isoforms have been found for this gene. | NCS1 | ENSG00000107130 | NA |
| 7169 | tropomyosin 2 (beta) | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | TPM2 | ENSG00000198467 | NA |
| 259 | alpha-1-microglobulin/bikunin precursor | This gene encodes a complex glycoprotein secreted in plasma. The precursor is proteolytically processed into distinct functioning proteins: alpha-1-microglobulin, which belongs to the superfamily of lipocalin transport proteins and may play a role in the regulation of inflammatory processes, and bikunin, which is a urinary trypsin inhibitor belonging to the superfamily of Kunitz-type protease inhibitors and plays an important role in many physiological and pathological processes. This gene is located on chromosome 9 in a cluster of lipocalin genes. | AMBP | ENSG00000106927 | NA |
| 7134 | troponin C1, slow skeletal and cardiac type | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | TNNC1 | ENSG00000114854 | NA |
| 3678 | integrin subunit alpha 5 | The product of this gene belongs to the integrin alpha chain family. Integrins are heterodimeric integral membrane proteins composed of an alpha subunit and a beta subunit that function in cell surface adhesion and signaling. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 5 subunit. This subunit associates with the beta 1 subunit to form a fibronectin receptor. This integrin may promote tumor invasion, and higher expression of this gene may be correlated with shorter survival time in lung cancer patients. Note that the integrin alpha 5 and integrin alpha V subunits are encoded by distinct genes. | ITGA5 | ENSG00000161638 | NA |
| 3263 | hemopexin | This gene encodes a plasma glycoprotein that binds heme with high affinity. The encoded protein is an acute phase protein that transports heme from the plasma to the liver and may be involved in protecting cells from oxidative stress. | HPX | ENSG00000110169 | NA |
| 88 | actinin alpha 2 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | ACTN2 | ENSG00000077522 | NA |
| 10627 | myosin light chain 12A | This gene encodes a nonsarcomeric myosin regulatory light chain. This protein is activated by phosphorylation and regulates smooth muscle and non-muscle cell contraction. This protein may also be involved in DNA damage repair by sequestering the transcriptional regulator apoptosis-antagonizing transcription factor (AATF)/Che-1 which functions as a repressor of p53-driven apoptosis. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 8. | MYL12A | ENSG00000101608 | NA |
| 1357 | carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | CPA1 | ENSG00000091704 | NA |
| 7273 | titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | TTN | ENSG00000155657 | NA |
| 8076 | microfibrillar associated protein 5 | This gene encodes a 25-kD microfibril-associated glycoprotein which is a component of microfibrils of the extracellular matrix. The encoded protein promotes attachment of cells to microfibrils via alpha-V-beta-3 integrin. Deficiency of this gene in mice results in neutropenia. Alternate splicing results in multiple transcript variants encoding different isoforms. | MFAP5 | ENSG00000197614 | NA |
| NA | NA | NA | NA | ENSG00000272761 | TRUE |
| 1264 | calponin 1 | NA | CNN1 | ENSG00000130176 | NA |
| 3043 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | HBB | ENSG00000244734 | NA |
| 2022 | endoglin | This gene encodes a homodimeric transmembrane protein which is a major glycoprotein of the vascular endothelium. This protein is a component of the transforming growth factor beta receptor complex and it binds to the beta1 and beta3 peptides with high affinity. Mutations in this gene cause hereditary hemorrhagic telangiectasia, also known as Osler-Rendu-Weber syndrome 1, an autosomal dominant multisystemic vascular dysplasia. This gene may also be involved in preeclampsia and several types of cancer. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENG | ENSG00000106991 | NA |
| 4607 | myosin binding protein C, cardiac | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | MYBPC3 | ENSG00000134571 | NA |
| 6711 | spectrin beta, non-erythrocytic 1 | Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein contains an N-terminal actin-binding domain, and 17 spectrin repeats which are involved in dimer formation. Multiple transcript variants encoding different isoforms have been found for this gene. | SPTBN1 | ENSG00000115306 | NA |
| 4015 | lysyl oxidase | This gene encodes a member of the lysyl oxidase family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate a regulatory propeptide and the mature enzyme. The copper-dependent amine oxidase activity of this enzyme functions in the crosslinking of collagens and elastin, while the propeptide may play a role in tumor suppression. | LOX | ENSG00000113083 | NA |
| 8048 | cysteine and glycine rich protein 3 | This gene encodes a member of the CSRP family of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this protein is found in a group of proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Mutations in this gene are thought to cause heritable forms of hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM) in humans. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. | CSRP3 | ENSG00000129170 | NA |
| 1303 | collagen type XII alpha 1 chain | This gene encodes the alpha chain of type XII collagen, a member of the FACIT (fibril-associated collagens with interrupted triple helices) collagen family. Type XII collagen is a homotrimer found in association with type I collagen, an association that is thought to modify the interactions between collagen I fibrils and the surrounding matrix. Alternatively spliced transcript variants encoding different isoforms have been identified. | COL12A1 | ENSG00000111799 | NA |
| 335 | apolipoprotein A1 | This gene encodes apolipoprotein A-I, which is the major protein component of high density lipoprotein (HDL) in plasma. The encoded preproprotein is proteolytically processed to generate the mature protein, which promotes cholesterol efflux from tissues to the liver for excretion, and is a cofactor for lecithin cholesterolacyltransferase (LCAT), an enzyme responsible for the formation of most plasma cholesteryl esters. This gene is closely linked with two other apolipoprotein genes on chromosome 11. Defects in this gene are associated with HDL deficiencies, including Tangier disease, and with systemic non-neuropathic amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein. | APOA1 | ENSG00000118137 | NA |
| 11034 | destrin, actin depolymerizing factor | The product of this gene belongs to the actin-binding proteins ADF family. This family of proteins is responsible for enhancing the turnover rate of actin in vivo. This gene encodes the actin depolymerizing protein that severs actin filaments (F-actin) and binds to actin monomers (G-actin). Two transcript variants encoding distinct isoforms have been identified for this gene. | DSTN | ENSG00000125868 | NA |
| 51559 | 5’-nucleotidase domain containing 3 | NA | NT5DC3 | ENSG00000111696 | NA |
| 57326 | PBX homeobox interacting protein 1 | The protein encoded by this gene interacts with the PBX1 homeodomain protein, inhibiting its transcriptional activation potential by preventing its binding to DNA. The encoded protein, which is primarily cytosolic but can shuttle to the nucleus, also can interact with estrogen receptors alpha and beta and promote the proliferation of breast cancer, brain tumors, and lung cancer. Several transcript variants encoding different isoforms have been found for this gene. More variants exist, but their full-length natures have yet to be determined. | PBXIP1 | ENSG00000163346 | NA |
| 6175 | ribosomal protein lateral stalk subunit P0 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein, which is the functional equivalent of the E. coli L10 ribosomal protein, belongs to the L10P family of ribosomal proteins. It is a neutral phosphoprotein with a C-terminal end that is nearly identical to the C-terminal ends of the acidic ribosomal phosphoproteins P1 and P2. The P0 protein can interact with P1 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Transcript variants derived from alternative splicing exist; they encode the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPLP0 | ENSG00000089157 | NA |
| 308 | annexin A5 | The protein encoded by this gene belongs to the annexin family of calcium-dependent phospholipid binding proteins some of which have been implicated in membrane-related events along exocytotic and endocytotic pathways. Annexin 5 is a phospholipase A2 and protein kinase C inhibitory protein with calcium channel activity and a potential role in cellular signal transduction, inflammation, growth and differentiation. Annexin 5 has also been described as placental anticoagulant protein I, vascular anticoagulant-alpha, endonexin II, lipocortin V, placental protein 4 and anchorin CII. The gene spans 29 kb containing 13 exons, and encodes a single transcript of approximately 1.6 kb and a protein product with a molecular weight of about 35 kDa. | ANXA5 | ENSG00000164111 | NA |
| 10457 | glycoprotein nmb | The protein encoded by this gene is a type I transmembrane glycoprotein which shows homology to the pMEL17 precursor, a melanocyte-specific protein. GPNMB shows expression in the lowly metastatic human melanoma cell lines and xenografts but does not show expression in the highly metastatic cell lines. GPNMB may be involved in growth delay and reduction of metastatic potential. Two transcript variants encoding different isoforms have been found for this gene. | GPNMB | ENSG00000136235 | NA |
| 11030 | RNA binding protein with multiple splicing | This gene encodes a member of the RNA recognition motif family of RNA-binding proteins. The RNA recognition motif is between 80-100 amino acids in length and family members contain one to four copies of the motif. The RNA recognition motif consists of two short stretches of conserved sequence, as well as a few highly conserved hydrophobic residues. The encoded protein has a single, putative RNA recognition motif in its N-terminus. Alternative splicing results in multiple transcript variants encoding different isoforms. | RBPMS | ENSG00000157110 | NA |
| 5644 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | PRSS1 | ENSG00000204983 | NA |
| 3912 | laminin subunit beta 1 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 1. The beta 1 chain has 7 structurally distinct domains which it shares with other beta chain isomers. The C-terminal helical region containing domains I and II are separated by domain alpha, domains III and V contain several EGF-like repeats, and domains IV and VI have a globular conformation. Laminin, beta 1 is expressed in most tissues that produce basement membranes, and is one of the 3 chains constituting laminin 1, the first laminin isolated from Engelbreth-Holm-Swarm (EHS) tumor. A sequence in the beta 1 chain that is involved in cell attachment, chemotaxis, and binding to the laminin receptor was identified and shown to have the capacity to inhibit metastasis. | LAMB1 | ENSG00000091136 | NA |
| 7916 | proline rich coiled-coil 2A | A cluster of genes, BAT1-BAT5, has been localized in the vicinity of the genes for TNF alpha and TNF beta. These genes are all within the human major histocompatibility complex class III region. This gene has microsatellite repeats which are associated with the age-at-onset of insulin-dependent diabetes mellitus (IDDM) and possibly thought to be involved with the inflammatory process of pancreatic beta-cell destruction during the development of IDDM. This gene is also a candidate gene for the development of rheumatoid arthritis. Two transcript variants encoding the same protein have been found for this gene. | PRRC2A | ENSG00000204469 | NA |
| 4313 | matrix metallopeptidase 2 | This gene is a member of the matrix metalloproteinase (MMP) gene family, whose members are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or intracellularly by its S-glutathiolation with no requirement for proteolytic removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | MMP2 | ENSG00000087245 | NA |
| 4892 | nebulin related anchoring protein | NA | NRAP | ENSG00000197893 | NA |
| 336 | apolipoprotein A2 | This gene encodes apolipoprotein (apo-) A-II, which is the second most abundant protein of the high density lipoprotein particles. The protein is found in plasma as a monomer, homodimer, or heterodimer with apolipoprotein D. Defects in this gene may result in apolipoprotein A-II deficiency or hypercholesterolemia. | APOA2 | ENSG00000158874 | NA |
| 30819 | potassium voltage-gated channel interacting protein 2 | This gene encodes a member of the family of voltage-gated potassium (Kv) channel-interacting proteins (KCNIPs), which belongs to the recoverin branch of the EF-hand superfamily. Members of the KCNIP family are small calcium binding proteins. They all have EF-hand-like domains, and differ from each other in the N-terminus. They are integral subunit components of native Kv4 channel complexes. They may regulate A-type currents, and hence neuronal excitability, in response to changes in intracellular calcium. Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified from this gene. | KCNIP2 | ENSG00000120049 | NA |
| 3851 | keratin 4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT4 | ENSG00000170477 | NA |
| 4624 | myosin, heavy chain 6, cardiac muscle, alpha | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | MYH6 | ENSG00000197616 | NA |
| 5066 | peptidylglycine alpha-amidating monooxygenase | This gene encodes a multifunctional protein. The encoded preproprotein is proteolytically processed to generate the mature enzyme. This enzyme includes two domains with distinct catalytic activities, a peptidylglycine alpha-hydroxylating monooxygenase (PHM) domain and a peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) domain. These catalytic domains work sequentially to catalyze the conversion of neuroendocrine peptides to active alpha-amidated products. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | PAM | ENSG00000145730 | NA |
| 8831 | synaptic Ras GTPase activating protein 1 | The protein encoded by this gene is a major component of the postsynaptic density (PSD), a group of proteins found associated with NMDA receptors at synapses. The encoded protein is phosphorylated by calmodulin-dependent protein kinase II and dephosphorylated by NMDA receptor activation. Defects in this gene are a cause of mental retardation autosomal dominant type 5 (MRD5). | SYNGAP1 | ENSG00000197283 | NA |
| 8557 | titin-cap | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | TCAP | ENSG00000173991 | NA |
| 1513 | cathepsin K | The protein encoded by this gene is a lysosomal cysteine proteinase involved in bone remodeling and resorption. This protein, which is a member of the peptidase C1 protein family, is predominantly expressed in osteoclasts. However, the encoded protein is also expressed in a significant fraction of human breast cancers, where it could contribute to tumor invasiveness. Mutations in this gene are the cause of pycnodysostosis, an autosomal recessive disease characterized by osteosclerosis and short stature. | CTSK | ENSG00000143387 | NA |
| 3956 | galectin 1 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. This gene product may act as an autocrine negative growth factor that regulates cell proliferation. | LGALS1 | ENSG00000100097 | NA |
| 348 | apolipoprotein E | The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | APOE | ENSG00000130203 | NA |
| 229 | aldolase, fructose-bisphosphate B | Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | ALDOB | ENSG00000136872 | NA |
| 23352 | ubiquitin protein ligase E3 component n-recognin 4 | The protein encoded by this gene is an E3 ubiquitin-protein ligase that interacts with the retinoblastoma-associated protein in the nucleus and with calcium-bound calmodulin in the cytoplasm. The encoded protein appears to be a cytoskeletal component in the cytoplasm and part of the chromatin scaffold in the nucleus. In addition, this protein is a target of the human papillomavirus type 16 E7 oncoprotein. | UBR4 | ENSG00000127481 | NA |
| 3242 | 4-hydroxyphenylpyruvate dioxygenase | The protein encoded by this gene is an enzyme in the catabolic pathway of tyrosine. The encoded protein catalyzes the conversion of 4-hydroxyphenylpyruvate to homogentisate. Defects in this gene are a cause of tyrosinemia type 3 (TYRO3) and hawkinsinuria (HAWK). Two transcript variants encoding different isoforms have been found for this gene. | HPD | ENSG00000158104 | NA |
| 4053 | latent transforming growth factor beta binding protein 2 | The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | LTBP2 | ENSG00000119681 | NA |
| ENSG00000234961 | NA | NA | RP11-124N14.3 | ENSG00000234961 | NA |
| 10136 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | CELA3A | ENSG00000142789 | NA |
| ENSG00000180139 | ACTA2 antisense RNA 1 | NA | ACTA2-AS1 | ENSG00000180139 | NA |
| 7139 | troponin T2, cardiac type | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | TNNT2 | ENSG00000118194 | NA |
| 81 | actinin alpha 4 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, alpha actinin isoform which is concentrated in the cytoplasm, and thought to be involved in metastatic processes. Mutations in this gene have been associated with focal and segmental glomerulosclerosis. | ACTN4 | ENSG00000130402 | NA |
| 2006 | elastin | This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | ELN | ENSG00000049540 | NA |
| 8407 | transgelin 2 | The protein encoded by this gene is similar to the protein transgelin, which is one of the earliest markers of differentiated smooth muscle. The specific function of this protein has not yet been determined, although it is thought to be a tumor suppressor. Multiple transcript variants encoding different isoforms have been found for this gene. | TAGLN2 | ENSG00000158710 | NA |
| 25802 | leiomodin 1 | The leiomodin 1 protein has a putative membrane-spanning region and 2 types of tandemly repeated blocks. The transcript is expressed in all tissues tested, with the highest levels in thyroid, eye muscle, skeletal muscle, and ovary. Increased expression of leiomodin 1 may be linked to Graves’ disease and thyroid-associated ophthalmopathy. | LMOD1 | ENSG00000163431 | NA |
| 3983 | actin binding LIM protein 1 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | ABLIM1 | ENSG00000099204 | NA |
| 59 | actin, alpha 2, smooth muscle, aorta | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ACTA2 | ENSG00000107796 | NA |
| 58 | actin, alpha 1, skeletal muscle | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | ACTA1 | ENSG00000143632 | NA |
| 7077 | TIMP metallopeptidase inhibitor 2 | This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | TIMP2 | ENSG00000035862 | NA |
| 3798 | kinesin family member 5A | This gene encodes a member of the kinesin family of proteins. Members of this family are part of a multisubunit complex that functions as a microtubule motor in intracellular organelle transport. Mutations in this gene cause autosomal dominant spastic paraplegia 10. | KIF5A | ENSG00000155980 | NA |
| 3164 | nuclear receptor subfamily 4 group A member 1 | This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. Expression is induced by phytohemagglutinin in human lymphocytes and by serum stimulation of arrested fibroblasts. The encoded protein acts as a nuclear transcription factor. Translocation of the protein from the nucleus to mitochondria induces apoptosis. Multiple transcript variants encoding different isoforms have been found for this gene. | NR4A1 | ENSG00000123358 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 9, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
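The same export step is repeated for each cluster with only the cluster index changing, so it could be wrapped in a small helper. The following is a minimal sketch under stated assumptions: `write_cluster_genes` and its arguments are hypothetical names that do not appear in the original analysis, and the example writes to a temporary directory rather than to `../utilities/GTEX2013_sparse_fac_sqrt/`.

```r
# Hypothetical helper (an assumption, not part of the original analysis):
# write the query IDs of one annotated cluster to
# "<out_dir>/gene_names_clus_<cluster>.txt", one ID per line.
write_cluster_genes <- function(query_ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(as.character(query_ids), path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)  # return the file path so the caller can check or reuse it
}

# Example with two Ensembl IDs taken from the cluster 9 table,
# written to a temporary directory.
ids <- c("ENSG00000091136", "ENSG00000204469")
path <- write_cluster_genes(ids, cluster = 9, out_dir = tempdir())
readLines(path)
```

In the analysis itself, a call such as `write_cluster_genes(out$query, 9, "../utilities/GTEX2013_sparse_fac_sqrt")` would stand in for the explicit `write.table` call, assuming `out` is the result of the preceding `mygene::queryMany` query.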
out <- mygene::queryMany(gene_list[10,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| X_id | symbol | query | name | summary |
|---|---|---|---|---|
| 7178 | TPT1 | ENSG00000133112 | tumor protein, translationally-controlled 1 | NA |
| 1915 | EEF1A1 | ENSG00000156508 | eukaryotic translation elongation factor 1 alpha 1 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart and skeletal muscle. This isoform is identified as an autoantigen in 66% of patients with Felty syndrome. This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes. |
| ENSG00000237973 | MTCO1P12 | ENSG00000237973 | MT-CO1 pseudogene 12 | NA |
| 6122 | RPL3 | ENSG00000100316 | ribosomal protein L3 | Ribosomes, the complexes that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L3P family of ribosomal proteins and it is located in the cytoplasm. The protein can bind to the HIV-1 TAR mRNA, and it has been suggested that the protein contributes to tat-mediated transactivation. This gene is co-transcribed with several small nucleolar RNA genes, which are located in several of this gene’s introns. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 23521 | RPL13A | ENSG00000142541 | ribosomal protein L13a | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the L13P family of ribosomal proteins that is a component of the 60S subunit. The encoded protein also plays a role in the repression of inflammatory genes as a component of the IFN-gamma-activated inhibitor of translation (GAIT) complex. This gene is co-transcribed with the small nucleolar RNA genes U32, U33, U34, and U35, which are located in the second, fourth, fifth, and sixth introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed throughout the genome. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| 3488 | IGFBP5 | ENSG00000115461 | insulin like growth factor binding protein 5 | NA |
| 6202 | RPS8 | ENSG00000142937 | ribosomal protein S8 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S8E family of ribosomal proteins. It is located in the cytoplasm. Increased expression of this gene in colorectal tumors and colon polyps compared to matched normal colonic mucosa has been observed. This gene is co-transcribed with the small nucleolar RNA genes U38A, U38B, U39, and U40, which are located in its fourth, fifth, first, and second introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6194 | RPS6 | ENSG00000137154 | ribosomal protein S6 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a cytoplasmic ribosomal protein that is a component of the 40S subunit. The protein belongs to the S6E family of ribosomal proteins. It is the major substrate of protein kinases in the ribosome, with subsets of five C-terminal serine residues phosphorylated by different protein kinases. Phosphorylation is induced by a wide range of stimuli, including growth factors, tumor-promoting agents, and mitogens. Dephosphorylation occurs at growth arrest. The protein may contribute to the control of cell growth and proliferation through the selective translation of particular classes of mRNA. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6137 | RPL13 | ENSG00000167526 | ribosomal protein L13 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L13E family of ribosomal proteins. It is located in the cytoplasm. This gene is expressed at significantly higher levels in benign breast lesions than in breast carcinomas. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6188 | RPS3 | ENSG00000149273 | ribosomal protein S3 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit, where it forms part of the domain where translation is initiated. The protein belongs to the S3P family of ribosomal proteins. Studies of the mouse and rat proteins have demonstrated that the protein has an extraribosomal role as an endonuclease involved in the repair of UV-induced DNA damage. The protein appears to be located in both the cytoplasm and nucleus but not in the nucleolus. Higher levels of expression of this gene in colon adenocarcinomas and adenomatous polyps compared to adjacent normal colonic mucosa have been observed. This gene is co-transcribed with the small nucleolar RNA genes U15A and U15B, which are located in its first and fifth introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. |
| 6135 | RPL11 | ENSG00000142676 | ribosomal protein L11 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L5P family of ribosomal proteins. It is located in the cytoplasm. The protein probably associates with the 5S rRNA. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6187 | RPS2 | ENSG00000140988 | ribosomal protein S2 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S5P family of ribosomal proteins. It is located in the cytoplasm. This gene shares sequence similarity with mouse LLRep3. It is co-transcribed with the small nucleolar RNA gene U64, which is located in its third intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6203 | RPS9 | ENSG00000170889 | ribosomal protein S9 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S4P family of ribosomal proteins. It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, multiple processed pseudogenes derived from this gene are dispersed through the genome. |
| 6130 | RPL7A | ENSG00000148303 | ribosomal protein L7a | Cytoplasmic ribosomes, organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L7AE family of ribosomal proteins. It can interact with a subclass of nuclear hormone receptors, including thyroid hormone receptor, and inhibit their ability to transactivate by preventing their binding to their DNA response elements. This gene is included in the surfeit gene cluster, a group of very tightly linked genes that do not share sequence similarity. It is co-transcribed with the U24, U36a, U36b, and U36c small nucleolar RNA genes, which are located in its second, fifth, fourth, and sixth introns, respectively. This gene rearranges with the trk proto-oncogene to form the chimeric oncogene trk-2h, which encodes an oncoprotein consisting of the N terminus of ribosomal protein L7a fused to the receptor tyrosine kinase domain of trk. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 9349 | RPL23 | ENSG00000125691 | ribosomal protein L23 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L14P family of ribosomal proteins. It is located in the cytoplasm. This gene has been referred to as rpL17 because the encoded protein shares amino acid identity with ribosomal protein L17 from Saccharomyces cerevisiae; however, its official symbol is RPL23. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6160 | RPL31 | ENSG00000071082 | ribosomal protein L31 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L31E family of ribosomal proteins. It is located in the cytoplasm. Higher levels of expression of this gene in familial adenomatous polyps compared to matched normal tissues have been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. |
| 6181 | RPLP2 | ENSG00000177600 | ribosomal protein lateral stalk subunit P2 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal phosphoprotein that is a component of the 60S subunit. The protein, which is a functional equivalent of the E. coli L7/L12 ribosomal protein, belongs to the L12P family of ribosomal proteins. It plays an important role in the elongation step of protein synthesis. Unlike most ribosomal proteins, which are basic, the encoded protein is acidic. Its C-terminal end is nearly identical to the C-terminal ends of the ribosomal phosphoproteins P0 and P1. The P2 protein can interact with P0 and P1 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6175 | RPLP0 | ENSG00000089157 | ribosomal protein lateral stalk subunit P0 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein, which is the functional equivalent of the E. coli L10 ribosomal protein, belongs to the L10P family of ribosomal proteins. It is a neutral phosphoprotein with a C-terminal end that is nearly identical to the C-terminal ends of the acidic ribosomal phosphoproteins P1 and P2. The P0 protein can interact with P1 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Transcript variants derived from alternative splicing exist; they encode the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6205 | RPS11 | ENSG00000142534 | ribosomal protein S11 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the S17P family of ribosomal proteins that is a component of the 40S subunit. This gene is co-transcribed with the small nucleolar RNA gene U35B, which is located in the third intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed throughout the genome. |
| 728658 | RPL13AP5 | ENSG00000236552 | ribosomal protein L13a pseudogene 5 | NA |
| 6143 | RPL19 | ENSG00000108298 | ribosomal protein L19 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L19E family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 1674 | DES | ENSG00000175084 | desmin | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. |
| 6158 | RPL28 | ENSG00000108107 | ribosomal protein L28 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L28E family of ribosomal proteins. It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternative splicing results in multiple transcript variants encoding distinct isoforms. |
| 6206 | RPS12 | ENSG00000112306 | ribosomal protein S12 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S12E family of ribosomal proteins. It is located in the cytoplasm. Increased expression of this gene in colorectal cancers compared to matched normal colonic mucosa has been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6136 | RPL12 | ENSG00000197958 | ribosomal protein L12 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L11P family of ribosomal proteins. It is located in the cytoplasm. The protein binds directly to the 26S rRNA. This gene is co-transcribed with the U65 snoRNA, which is located in its fourth intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6132 | RPL8 | ENSG00000161016 | ribosomal protein L8 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L2P family of ribosomal proteins. It is located in the cytoplasm. In rat, the protein associates with the 5.8S rRNA, very likely participates in the binding of aminoacyl-tRNA, and is a constituent of the elongation factor 2-binding site at the ribosomal subunit interface. Alternatively spliced transcript variants encoding the same protein exist. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6222 | RPS18 | ENSG00000231500 | ribosomal protein S18 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S13P family of ribosomal proteins. It is located in the cytoplasm. The gene product of the E. coli ortholog (ribosomal protein S13) is involved in the binding of fMet-tRNA, and thus, in the initiation of translation. This gene is an ortholog of mouse Ke3. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6229 | RPS24 | ENSG00000138326 | ribosomal protein S24 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S24E family of ribosomal proteins. It is located in the cytoplasm. Multiple transcript variants encoding different isoforms have been found for this gene. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Mutations in this gene result in Diamond-Blackfan anemia. |
| 6224 | RPS20 | ENSG00000008988 | ribosomal protein S20 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S10P family of ribosomal proteins. It is located in the cytoplasm. This gene is co-transcribed with the small nucleolar RNA gene U54, which is located in its second intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Two transcript variants encoding different isoforms have been identified for this gene. |
| 10399 | RACK1 | ENSG00000204628 | receptor for activated C kinase 1 | NA |
| 6217 | RPS16 | ENSG00000105193 | ribosomal protein S16 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S9P family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 4629 | MYH11 | ENSG00000133392 | myosin, heavy chain 11, smooth muscle | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. |
| 6168 | RPL37A | ENSG00000197756 | ribosomal protein L37a | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L37AE family of ribosomal proteins. It is located in the cytoplasm. The protein contains a C4-type zinc finger-like domain. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6161 | RPL32 | ENSG00000144713 | ribosomal protein L32 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L32E family of ribosomal proteins. It is located in the cytoplasm. Although some studies have mapped this gene to 3q13.3-q21, it is believed to map to 3p25-p24. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternatively spliced transcript variants encoding the same protein have been observed for this gene. |
| ENSG00000273149 | RP11-290D2.6 | ENSG00000273149 | NA | NA |
| 6167 | RPL37 | ENSG00000145592 | ribosomal protein L37 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L37E family of ribosomal proteins. It is located in the cytoplasm. The protein contains a C2C2-type zinc finger-like motif. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6141 | RPL18 | ENSG00000063177 | ribosomal protein L18 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the L18E family of ribosomal proteins that is a component of the 60S subunit. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| 6208 | RPS14 | ENSG00000164587 | ribosomal protein S14 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S11P family of ribosomal proteins. It is located in the cytoplasm. Transcript variants utilizing alternative transcription initiation sites have been described in the literature. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. In Chinese hamster ovary cells, mutations in this gene can lead to resistance to emetine, a protein synthesis inhibitor. Multiple alternatively spliced transcript variants encoding the same protein have been found for this gene. |
| 29997 | GLTSCR2 | ENSG00000105373 | glioma tumor suppressor candidate region gene 2 | NA |
| ENSG00000244398 | RP11-466H18.1 | ENSG00000244398 | NA | NA |
| ENSG00000232573 | RPL3P4 | ENSG00000232573 | ribosomal protein L3 pseudogene 4 | NA |
| 6227 | RPS21 | ENSG00000171858 | ribosomal protein S21 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S21E family of ribosomal proteins. It is located in the cytoplasm. Alternative splice variants that encode different protein isoforms have been described, but their existence has not been verified. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6233 | RPS27A | ENSG00000143947 | ribosomal protein S27a | Ubiquitin, a highly conserved protein that has a major role in targeting cellular proteins for degradation by the 26S proteasome, is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin fused to an unrelated protein. This gene encodes a fusion protein consisting of ubiquitin at the N terminus and ribosomal protein S27a at the C terminus. When expressed in yeast, the protein is post-translationally processed, generating free ubiquitin monomer and ribosomal protein S27a. Ribosomal protein S27a is a component of the 40S subunit of the ribosome and belongs to the S27AE family of ribosomal proteins. It contains C4-type zinc finger domains and is located in the cytoplasm. Pseudogenes derived from this gene are present in the genome. As with ribosomal protein S27a, ribosomal protein L40 is also synthesized as a fusion protein with ubiquitin; similarly, ribosomal protein S30 is synthesized as a fusion protein with the ubiquitin-like protein fubi. Multiple alternatively spliced transcript variants that encode the same proteins have been identified. |
| 4736 | RPL10A | ENSG00000198755 | ribosomal protein L10a | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L1P family of ribosomal proteins. It is located in the cytoplasm. The expression of this gene is downregulated in the thymus by cyclosporin-A (CsA), an immunosuppressive drug. Studies in mice have shown that the expression of the ribosomal protein L10a gene is downregulated in neural precursor cells during development. This gene previously was referred to as NEDD6 (neural precursor cell expressed, developmentally downregulated 6), but it has been renamed RPL10A (ribosomal protein 10a). As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 3487 | IGFBP4 | ENSG00000141753 | insulin like growth factor binding protein 4 | This gene is a member of the insulin-like growth factor binding protein (IGFBP) family and encodes a protein with an IGFBP domain and a thyroglobulin type-I domain. The protein binds both insulin-like growth factors (IGFs) I and II and circulates in the plasma in both glycosylated and non-glycosylated forms. Binding of this protein prolongs the half-life of the IGFs and alters their interaction with cell surface receptors. |
| ENSG00000196205 | EEF1A1P5 | ENSG00000196205 | eukaryotic translation elongation factor 1 alpha 1 pseudogene 5 | NA |
| 6164 | RPL34 | ENSG00000109475 | ribosomal protein L34 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L34E family of ribosomal proteins. It is located in the cytoplasm. This gene originally was thought to be located at 17q21, but it has been mapped to 4q. Overexpression of this gene has been observed in some cancer cells. Alternative splicing results in multiple transcript variants, all encoding the same isoform. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 1360 | CPB1 | ENSG00000153002 | carboxypeptidase B1 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. |
| 4155 | MBP | ENSG00000197971 | myelin basic protein | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. |
| ENSG00000229344 | MTCO2P12 | ENSG00000229344 | MT-CO2 pseudogene 12 | NA |
| ENSG00000225630 | MTND2P28 | ENSG00000225630 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | NA |
| ENSG00000213442 | RPL18AP3 | ENSG00000213442 | ribosomal protein L18a pseudogene 3 | NA |
| ENSG00000227097 | RPS28P7 | ENSG00000227097 | ribosomal protein S28 pseudogene 7 | NA |
| 3921 | RPSA | ENSG00000168028 | ribosomal protein SA | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Many of the effects of laminin are mediated through interactions with cell surface receptors. These receptors include members of the integrin family, as well as non-integrin laminin-binding proteins. This gene encodes a high-affinity, non-integrin family, laminin receptor 1. This receptor has been variously called 67 kD laminin receptor, 37 kD laminin receptor precursor (37LRP) and p40 ribosome-associated protein. The amino acid sequence of laminin receptor 1 is highly conserved through evolution, suggesting a key biological function. It has been observed that the level of the laminin receptor transcript is higher in colon carcinoma tissue and lung cancer cell line than their normal counterparts. Also, there is a correlation between the upregulation of this polypeptide in cancer cells and their invasive and metastatic phenotype. Multiple copies of this gene exist, however, most of them are pseudogenes thought to have arisen from retropositional events. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. |
| 9045 | RPL14 | ENSG00000188846 | ribosomal protein L14 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L14E family of ribosomal proteins. It contains a basic region-leucine zipper (bZIP)-like domain. The protein is located in the cytoplasm. This gene contains a trinucleotide (GCT) repeat tract whose length is highly polymorphic; these triplet repeats result in a stretch of alanine residues in the encoded protein. Transcript variants utilizing alternative polyA signals and alternative 5’-terminal exons exist but all encode the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6157 | RPL27A | ENSG00000166441 | ribosomal protein L27a | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L15P family of ribosomal proteins. It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, multiple processed pseudogenes derived from this gene are dispersed through the genome. |
| 6228 | RPS23 | ENSG00000186468 | ribosomal protein S23 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S12P family of ribosomal proteins. It is located in the cytoplasm. The protein shares significant amino acid similarity with S. cerevisiae ribosomal protein S28. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6159 | RPL29 | ENSG00000162244 | ribosomal protein L29 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a cytoplasmic ribosomal protein that is a component of the 60S subunit. The protein belongs to the L29E family of ribosomal proteins. The protein is also a peripheral membrane protein expressed on the cell surface that directly binds heparin. Although this gene was previously reported to map to 3q29-qter, it is believed that it is located at 3p21.3-p21.2. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 7273 | TTN | ENSG00000155657 | titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. |
| 6223 | RPS19 | ENSG00000105372 | ribosomal protein S19 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S19E family of ribosomal proteins. It is located in the cytoplasm. Mutations in this gene cause Diamond-Blackfan anemia (DBA), a constitutional erythroblastopenia characterized by absent or decreased erythroid precursors, in a subset of patients. This suggests a possible extra-ribosomal function for this gene in erythropoietic differentiation and proliferation, in addition to its ribosomal function. Higher expression levels of this gene in some primary colon carcinomas compared to matched normal colon tissues have been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 2495 | FTH1 | ENSG00000167996 | ferritin heavy chain 1 | This gene encodes the heavy subunit of ferritin, the major intracellular iron storage protein in prokaryotes and eukaryotes. It is composed of 24 subunits of the heavy and light ferritin chains. Variation in ferritin subunit composition may affect the rates of iron uptake and release in different tissues. A major function of ferritin is the storage of iron in a soluble and nontoxic state. Defects in ferritin proteins are associated with several neurodegenerative diseases. This gene has multiple pseudogenes. Several alternatively spliced transcript variants have been observed, but their biological validity has not been determined. |
| 6156 | RPL30 | ENSG00000156482 | ribosomal protein L30 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L30E family of ribosomal proteins. It is located in the cytoplasm. This gene is co-transcribed with the U72 small nucleolar RNA gene, which is located in its fourth intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 3615 | IMPDH2 | ENSG00000178035 | IMP (inosine 5’-monophosphate) dehydrogenase 2 | This gene encodes the rate-limiting enzyme in the de novo guanine nucleotide biosynthesis. It is thus involved in maintaining cellular guanine deoxy- and ribonucleotide pools needed for DNA and RNA synthesis. The encoded protein catalyzes the NAD-dependent oxidation of inosine-5’-monophosphate into xanthine-5’-monophosphate, which is then converted into guanosine-5’-monophosphate. This gene is up-regulated in some neoplasms, suggesting it may play a role in malignant transformation. |
| 6193 | RPS5 | ENSG00000083845 | ribosomal protein S5 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S7P family of ribosomal proteins. It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6165 | RPL35A | ENSG00000182899 | ribosomal protein L35a | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L35AE family of ribosomal proteins. It is located in the cytoplasm. The rat protein has been shown to bind to both initiator and elongator tRNAs, and thus, it is located at the P site, or P and A sites, of the ribosome. Although this gene was originally mapped to chromosome 18, it has been established that it is located at 3q29-qter. Alternative splicing results in multiple transcript variants. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 6235 | RPS29 | ENSG00000213741 | ribosomal protein S29 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit and a member of the S14P family of ribosomal proteins. The protein, which contains a C2-C2 zinc finger-like domain that can bind to zinc, can enhance the tumor suppressor activity of Ras-related protein 1A (KREV1). It is located in the cytoplasm. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. |
| 1357 | CPA1 | ENSG00000091704 | carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. |
| 1056 | CEL | ENSG00000170835 | carboxyl ester lipase | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. |
| 6125 | RPL5 | ENSG00000122406 | ribosomal protein L5 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L18P family of ribosomal proteins. It is located in the cytoplasm. The protein binds 5S rRNA to form a stable complex called the 5S ribonucleoprotein particle (RNP), which is necessary for the transport of nonribosome-associated cytoplasmic 5S rRNA to the nucleolus for assembly into ribosomes. The protein interacts specifically with the beta subunit of casein kinase II. Variable expression of this gene in colorectal cancers compared to adjacent normal tissues has been observed, although no correlation between the level of expression and the severity of the disease has been found. This gene is co-transcribed with the small nucleolar RNA gene U21, which is located in its fifth intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 71 | ACTG1 | ENSG00000184009 | actin gamma 1 | Actins are highly conserved proteins that are involved in various types of cell motility, and maintenance of the cytoskeleton. In vertebrates, three main groups of actin isoforms, alpha, beta and gamma have been identified. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton, and as mediators of internal cell motility. Actin, gamma 1, encoded by this gene, is a cytoplasmic actin found in non-muscle cells. Mutations in this gene are associated with DFNA20/26, a subtype of autosomal dominant non-syndromic sensorineural progressive hearing loss. Alternative splicing results in multiple transcript variants. |
| 125144 | LRRC75A-AS1 | ENSG00000175061 | LRRC75A antisense RNA 1 | NA |
| 6209 | RPS15 | ENSG00000115268 | ribosomal protein S15 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S19P family of ribosomal proteins. It is located in the cytoplasm. This gene has been found to be activated in various tumors, such as insulinomas, esophageal cancers, and colon cancers. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternative splicing results in multiple transcript variants. |
| 1933 | EEF1B2 | ENSG00000114942 | eukaryotic translation elongation factor 1 beta 2 | This gene encodes a translation elongation factor. The protein is a guanine nucleotide exchange factor involved in the transfer of aminoacylated tRNAs to the ribosome. Alternative splicing results in three transcript variants which differ only in the 5’ UTR. |
| ENSG00000242071 | RPL7AP6 | ENSG00000242071 | ribosomal protein L7a pseudogene 6 | NA |
| 5644 | PRSS1 | ENSG00000204983 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. |
| 2813 | GP2 | ENSG00000169347 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. |
| ENSG00000240342 | RPS2P5 | ENSG00000240342 | ribosomal protein S2 pseudogene 5 | NA |
| 6169 | RPL38 | ENSG00000172809 | ribosomal protein L38 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L38E family of ribosomal proteins. It is located in the cytoplasm. Alternative splice variants have been identified, both encoding the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome, including one located in the promoter region of the type 1 angiotensin II receptor gene. |
| 11224 | RPL35 | ENSG00000136942 | ribosomal protein L35 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L29P family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 302 | ANXA2 | ENSG00000182718 | annexin A2 | This gene encodes a member of the annexin family. Members of this calcium-dependent phospholipid-binding protein family play a role in the regulation of cellular growth and in signal transduction pathways. This protein functions as an autocrine factor which heightens osteoclast formation and bone resorption. This gene has three pseudogenes located on chromosomes 4, 9 and 10, respectively. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. |
| 84525 | HOPX | ENSG00000171476 | HOP homeobox | The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. |
| 4878 | NPPA | ENSG00000175206 | natriuretic peptide A | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. |
| 4192 | MDK | ENSG00000110492 | midkine (neurite growth-promoting factor 2) | This gene encodes a member of a small family of secreted growth factors that binds heparin and responds to retinoic acid. The encoded protein promotes cell growth, migration, and angiogenesis, in particular during tumorigenesis. This gene has been targeted as a therapeutic for a variety of different disorders. Alternatively spliced transcript variants encoding multiple isoforms have been observed. |
| 6171 | RPL41 | ENSG00000229117 | ribosomal protein L41 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein, which shares sequence similarity with the yeast ribosomal protein YL41, belongs to the L41E family of ribosomal proteins. It is located in the cytoplasm. The protein can interact with the beta subunit of protein kinase CKII and can stimulate the phosphorylation of DNA topoisomerase II-alpha by CKII. Two alternative splice variants have been identified, both encoding the same protein. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| ENSG00000137970 | RPL7P9 | ENSG00000137970 | ribosomal protein L7 pseudogene 9 | NA |
| 6155 | RPL27 | ENSG00000131469 | ribosomal protein L27 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L27E family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| 1277 | COL1A1 | ENSG00000108821 | collagen type I alpha 1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. |
| 291 | SLC25A4 | ENSG00000151729 | solute carrier family 25 member 4 | This gene is a member of the mitochondrial carrier subfamily of solute carrier protein genes. The product of this gene functions as a gated pore that translocates ADP from the cytoplasm into the mitochondrial matrix and ATP from the mitochondrial matrix into the cytoplasm. The protein forms a homodimer embedded in the inner mitochondrial membrane. Mutations in this gene have been shown to result in autosomal dominant progressive external ophthalmoplegia and familial hypertrophic cardiomyopathy. |
| 1938 | EEF2 | ENSG00000167658 | eukaryotic translation elongation factor 2 | This gene encodes a member of the GTP-binding translation elongation factor family. This protein is an essential factor for protein synthesis. It promotes the GTP-dependent translocation of the nascent protein chain from the A-site to the P-site of the ribosome. This protein is completely inactivated by EF-2 kinase phosphorylation. |
| 10136 | CELA3A | ENSG00000142789 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. |
| ENSG00000213553 | RPLP0P6 | ENSG00000213553 | ribosomal protein, large, P0 pseudogene 6 | NA |
| 6201 | RPS7 | ENSG00000171863 | ribosomal protein S7 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S7E family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| ENSG00000266844 | RP11-862L9.3 | ENSG00000266844 | NA | NA |
| ENSG00000227081 | RP11-543P15.1 | ENSG00000227081 | NA | NA |
| ENSG00000230202 | RP11-632C17__A.1 | ENSG00000230202 | NA | NA |
| 6154 | RPL26 | ENSG00000161970 | ribosomal protein L26 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L24P family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Mutations in this gene result in Diamond-Blackfan anemia. Alternative splicing results in multiple transcript variants. |
| 6142 | RPL18A | ENSG00000105640 | ribosomal protein L18a | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the L18AE family of ribosomal proteins that is a component of the 60S subunit. The encoded protein may play a role in viral replication by interacting with the hepatitis C virus internal ribosome entry site (IRES). This gene is co-transcribed with the U68 snoRNA, located within the third intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed throughout the genome. |
| 6230 | RPS25 | ENSG00000118181 | ribosomal protein S25 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S25E family of ribosomal proteins. It is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. |
| ENSG00000234851 | RPL23AP42 | ENSG00000234851 | ribosomal protein L23a pseudogene 42 | NA |
| ENSG00000234797 | RPS3AP6 | ENSG00000234797 | ribosomal protein S3A pseudogene 6 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 10, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
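The `write.table()` call above is repeated for each factor with only the cluster index changing. A minimal sketch of a helper that takes the index as an argument follows. The function name `write_cluster_genes` and its `out_dir` argument are hypothetical and not part of the original analysis. Unlike the original call, it also drops repeated IDs, on the assumption that one line per gene is wanted even when `queryMany()` returns several rows for the same query.

```r
# Hypothetical helper (not in the original analysis): write the unique
# query IDs for cluster k to gene_names_clus_<k>.txt inside out_dir.
write_cluster_genes <- function(query_ids, k, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  ids <- unique(as.character(query_ids))  # assumption: one line per gene ID
  write.table(ids, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Example using a temporary directory and two IDs from the table above,
# one of them deliberately repeated.
demo_path <- write_cluster_genes(
  c("ENSG00000122406", "ENSG00000184009", "ENSG00000184009"),
  k = 10, out_dir = tempdir()
)
demo_ids <- readLines(demo_path)
```

In the analysis itself the call would look like `write_cluster_genes(out$query, 10, "../utilities/GTEX2013_sparse_fac_sqrt")`.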
out <- mygene::queryMany(gene_list[11,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | symbol | X_id | summary | name | notfound |
|---|---|---|---|---|---|
| ENSG00000163017 | ACTG2 | 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | actin, gamma 2, smooth muscle, enteric | NA |
| ENSG00000106624 | AEBP1 | 165 | This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | AE binding protein 1 | NA |
| ENSG00000026025 | VIM | 7431 | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | vimentin | NA |
| ENSG00000172867 | KRT2 | 3849 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 2 | NA |
| ENSG00000133392 | MYH11 | 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | NA |
| ENSG00000075624 | ACTB | 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | actin, beta | NA |
| ENSG00000042832 | TG | 7038 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | thyroglobulin | NA |
| ENSG00000117289 | NA | NA | NA | NA | TRUE |
| ENSG00000096696 | DSP | 1832 | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | desmoplakin | NA |
| ENSG00000225630 | MTND2P28 | ENSG00000225630 | NA | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | NA |
| ENSG00000169710 | FASN | 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | fatty acid synthase | NA |
| ENSG00000112096 | LOC100129518 | 100129518 | NA | uncharacterized LOC100129518 | NA |
| ENSG00000112096 | SOD2 | 6648 | This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | superoxide dismutase 2, mitochondrial | NA |
| ENSG00000130176 | CNN1 | 1264 | NA | calponin 1 | NA |
| ENSG00000186395 | KRT10 | 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | keratin 10 | NA |
| ENSG00000155657 | TTN | 7273 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | titin | NA |
| ENSG00000143847 | PPFIA4 | 8497 | PPFIA4, or liprin-alpha-4, belongs to the liprin-alpha gene family. See liprin-alpha-1 (LIP1, or PPFIA1; MIM 611054) for background on liprins. | PTPRF interacting protein alpha 4 | NA |
| ENSG00000269936 | RP11-394O4.5 | ENSG00000269936 | NA | NA | NA |
| ENSG00000162896 | PIGR | 5284 | This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | polymeric immunoglobulin receptor | NA |
| ENSG00000099194 | SCD | 6319 | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | stearoyl-CoA desaturase | NA |
| ENSG00000100345 | MYH9 | 4627 | This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | myosin, heavy chain 9, non-muscle | NA |
| ENSG00000159251 | ACTC1 | 70 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | actin, alpha, cardiac muscle 1 | NA |
| ENSG00000196091 | MYBPC1 | 4604 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | myosin binding protein C, slow type | NA |
| ENSG00000049323 | LTBP1 | 4052 | The protein encoded by this gene belongs to the family of latent TGF-beta binding proteins (LTBPs). The secretion and activation of TGF-betas is regulated by their association with latency-associated proteins and with latent TGF-beta binding proteins. The product of this gene targets latent complexes of transforming growth factor beta to the extracellular matrix, where the latent cytokine is subsequently activated by several different mechanisms. Alternatively spliced transcript variants encoding different isoforms have been identified. | latent transforming growth factor beta binding protein 1 | NA |
| ENSG00000185303 | SFTPA2 | 729238 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | surfactant protein A2 | NA |
| ENSG00000109472 | CPE | 1363 | This gene encodes a member of the M14 family of metallocarboxypeptidases. The encoded preproprotein is proteolytically processed to generate the mature peptidase. This peripheral membrane protein cleaves C-terminal amino acid residues and is involved in the biosynthesis of peptide hormones and neurotransmitters, including insulin. This protein may also function independently of its peptidase activity, as a neurotrophic factor that promotes neuronal survival, and as a sorting receptor that binds to regulated secretory pathway proteins, including prohormones. Mutations in this gene are implicated in type 2 diabetes. | carboxypeptidase E | NA |
| ENSG00000188536 | HBA2 | 3040 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | hemoglobin subunit alpha 2 | NA |
| ENSG00000108515 | ENO3 | 2027 | This gene encodes one of the three enolase isoenzymes found in mammals. This isoenzyme is found in skeletal muscle cells in the adult where it may play a role in muscle development and regeneration. A switch from alpha enolase to beta enolase occurs in muscle tissue during development in rodents. Mutations in this gene have been associated with glycogen storage disease. Alternatively spliced transcript variants encoding different isoforms have been described. | enolase 3 | NA |
| ENSG00000168484 | SFTPC | 6440 | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | surfactant protein C | NA |
| ENSG00000189058 | APOD | 347 | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | apolipoprotein D | NA |
| ENSG00000148677 | ANKRD1 | 27063 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | ankyrin repeat domain 1 | NA |
| ENSG00000197616 | MYH6 | 4624 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | myosin, heavy chain 6, cardiac muscle, alpha | NA |
| ENSG00000122852 | SFTPA1 | 653509 | This gene encodes a lung surfactant protein that is a member of a subfamily of C-type lectins called collectins. The encoded protein binds specific carbohydrate moieties found on lipids and on the surface of microorganisms. This protein plays an essential role in surfactant homeostasis and in the defense against respiratory pathogens. Mutations in this gene are associated with idiopathic pulmonary fibrosis. Alternate splicing results in multiple transcript variants. | surfactant protein A1 | NA |
| ENSG00000159176 | CSRP1 | 1465 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | cysteine and glycine rich protein 1 | NA |
| ENSG00000170835 | CEL | 1056 | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | carboxyl ester lipase | NA |
| ENSG00000163453 | IGFBP7 | 3490 | This gene encodes a member of the insulin-like growth factor (IGF)-binding protein (IGFBP) family. IGFBPs bind IGFs with high affinity, and regulate IGF availability in body fluids and tissues and modulate IGF binding to its receptors. This protein binds IGF-I and IGF-II with relatively low affinity, and belongs to a subfamily of low-affinity IGFBPs. It also stimulates prostacyclin production and cell adhesion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene, and one variant has been associated with retinal arterial macroaneurysm (PMID:21835307). | insulin like growth factor binding protein 7 | NA |
| ENSG00000196616 | ADH1B | 125 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | alcohol dehydrogenase 1B (class I), beta polypeptide | NA |
| ENSG00000182871 | COL18A1 | 80781 | This gene encodes the alpha chain of type XVIII collagen. This collagen is one of the multiplexins, extracellular matrix proteins that contain multiple triple-helix domains (collagenous domains) interrupted by non-collagenous domains. A long isoform of the protein has an N-terminal domain that is homologous to the extracellular part of frizzled receptors. Proteolytic processing at several endogenous cleavage sites in the C-terminal domain results in production of endostatin, a potent antiangiogenic protein that is able to inhibit angiogenesis and tumor growth. Mutations in this gene are associated with Knobloch syndrome. The main features of this syndrome involve retinal abnormalities, so type XVIII collagen may play an important role in retinal structure and in neural tube closure. Alternative splicing results in multiple transcript variants. | collagen type XVIII alpha 1 chain | NA |
| ENSG00000204388 | HSPA1B | 3304 | This intronless gene encodes a 70kDa heat shock protein which is a member of the heat shock protein 70 family. In conjunction with other heat shock proteins, this protein stabilizes existing proteins against aggregation and mediates the folding of newly translated proteins in the cytosol and in organelles. It is also involved in the ubiquitin-proteasome pathway through interaction with the AU-rich element RNA-binding protein 1. The gene is located in the major histocompatibility complex class III region, in a cluster with two closely related genes which encode similar proteins. | heat shock protein family A (Hsp70) member 1B | NA |
| ENSG00000109061 | MYH1 | 4619 | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | myosin, heavy chain 1, skeletal muscle, adult | NA |
| ENSG00000167658 | EEF2 | 1938 | This gene encodes a member of the GTP-binding translation elongation factor family. This protein is an essential factor for protein synthesis. It promotes the GTP-dependent translocation of the nascent protein chain from the A-site to the P-site of the ribosome. This protein is completely inactivated by EF-2 kinase phosphorylation. | eukaryotic translation elongation factor 2 | NA |
| ENSG00000149591 | TAGLN | 6876 | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | transgelin | NA |
| ENSG00000185650 | ZFP36L1 | 677 | This gene is a member of the TIS11 family of early response genes, which are induced by various agonists such as the phorbol ester TPA and the polypeptide mitogen EGF. This gene is well conserved across species and has a promoter that contains motifs seen in other early-response genes. The encoded protein contains a distinguishing putative zinc finger domain with a repeating cys-his motif. This putative nuclear transcription factor most likely functions in regulating the response to growth factors. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ZFP36 ring finger protein-like 1 | NA |
| ENSG00000171401 | KRT13 | 3860 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | keratin 13 | NA |
| ENSG00000237973 | MTCO1P12 | ENSG00000237973 | NA | MT-CO1 pseudogene 12 | NA |
| ENSG00000065534 | MYLK | 4638 | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | myosin light chain kinase | NA |
| ENSG00000182253 | SYNM | 23336 | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | synemin | NA |
| ENSG00000166819 | PLIN1 | 5346 | The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | perilipin 1 | NA |
| ENSG00000203782 | LOR | 4014 | This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel’s syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases. | loricrin | NA |
| ENSG00000131095 | GFAP | 2670 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | glial fibrillary acidic protein | NA |
| ENSG00000211445 | GPX3 | 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | glutathione peroxidase 3 | NA |
| ENSG00000211896 | IGHG1 | ENSG00000211896 | NA | immunoglobulin heavy constant gamma 1 (G1m marker) | NA |
| ENSG00000099204 | ABLIM1 | 3983 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | actin binding LIM protein 1 | NA |
| ENSG00000130203 | APOE | 348 | The protein encoded by this gene is a major apoprotein of the chylomicron. It binds to a specific liver and peripheral cell receptor, and is essential for the normal catabolism of triglyceride-rich lipoprotein constituents. This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes. Mutations in this gene result in familial dysbetalipoproteinemia, or type III hyperlipoproteinemia (HLP III), in which increased plasma cholesterol and triglycerides are the consequence of impaired clearance of chylomicron and VLDL remnants. Alternative splicing results in multiple transcript variants. | apolipoprotein E | NA |
| ENSG00000206172 | HBA1 | 3039 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | hemoglobin subunit alpha 1 | NA |
| ENSG00000158887 | MPZ | 4359 | This gene is specifically expressed in Schwann cells of the peripheral nervous system and encodes a type I transmembrane glycoprotein that is a major structural protein of the peripheral myelin sheath. The encoded protein contains a large hydrophobic extracellular domain and a smaller basic intracellular domain, which are essential for the formation and stabilization of the multilamellar structure of the compact myelin. Mutations in this gene are associated with autosomal dominant form of Charcot-Marie-Tooth disease type 1 (CMT1B) and other polyneuropathies, such as Dejerine-Sottas syndrome (DSS) and congenital hypomyelinating neuropathy (CHN). A recent study showed that two isoforms are produced from the same mRNA by use of alternative in-frame translation termination codons via a stop codon readthrough mechanism. | myelin protein zero | NA |
| ENSG00000142156 | COL6A1 | 1291 | The collagens are a superfamily of proteins that play a role in maintaining the integrity of various tissues. Collagens are extracellular matrix proteins and have a triple-helical domain as their common structural element. Collagen VI is a major structural component of microfibrils. The basic structural unit of collagen VI is a heterotrimer of the alpha1(VI), alpha2(VI), and alpha3(VI) chains. The alpha2(VI) and alpha3(VI) chains are encoded by the COL6A2 and COL6A3 genes, respectively. The protein encoded by this gene is the alpha 1 subunit of type VI collagen (alpha1(VI) chain). Mutations in the genes that code for the collagen VI subunits result in the autosomal dominant disorder, Bethlem myopathy. | collagen type VI alpha 1 | NA |
| ENSG00000167588 | GPD1 | 2819 | This gene encodes a member of the NAD-dependent glycerol-3-phosphate dehydrogenase family. The encoded protein plays a critical role in carbohydrate and lipid metabolism by catalyzing the reversible conversion of dihydroxyacetone phosphate (DHAP) and reduced nicotinamide adenine dinucleotide (NADH) to glycerol-3-phosphate (G3P) and NAD+. The encoded cytosolic protein and mitochondrial glycerol-3-phosphate dehydrogenase also form a glycerol phosphate shuttle that facilitates the transfer of reducing equivalents from the cytosol to mitochondria. Mutations in this gene are a cause of transient infantile hypertriglyceridemia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | glycerol-3-phosphate dehydrogenase 1 | NA |
| ENSG00000197249 | SERPINA1 | 5265 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | serpin family A member 1 | NA |
| ENSG00000112139 | MDGA1 | 266727 | NA | MAM domain containing glycosylphosphatidylinositol anchor 1 | NA |
| ENSG00000118985 | ELL2 | 22936 | NA | elongation factor for RNA polymerase II 2 | NA |
| ENSG00000118194 | TNNT2 | 7139 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms; however, the full-length nature of some of these variants has not yet been determined. | troponin T2, cardiac type | NA |
| ENSG00000129521 | EGLN3 | 112399 | NA | egl-9 family hypoxia inducible factor 3 | NA |
| ENSG00000011105 | TSPAN9 | 10867 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. Alternatively spliced transcripts encoding the same protein have been identified. | tetraspanin 9 | NA |
| ENSG00000158516 | CPA2 | 1358 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | carboxypeptidase A2 | NA |
| ENSG00000169604 | ANTXR1 | 84168 | This gene encodes a type I transmembrane protein and is a tumor-specific endothelial marker that has been implicated in colorectal cancer. The encoded protein has been shown to also be a docking protein or receptor for Bacillus anthracis toxin, the causative agent of the disease, anthrax. The binding of the protective antigen (PA) component, of the tripartite anthrax toxin, to this receptor protein mediates delivery of toxin components to the cytosol of cells. Once inside the cell, the other two components of anthrax toxin, edema factor (EF) and lethal factor (LF) disrupt normal cellular processes. Three alternatively spliced variants that encode different protein isoforms have been described. | anthrax toxin receptor 1 | NA |
| ENSG00000198467 | TPM2 | 7169 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | tropomyosin 2 (beta) | NA |
| ENSG00000163346 | PBXIP1 | 57326 | The protein encoded by this gene interacts with the PBX1 homeodomain protein, inhibiting its transcriptional activation potential by preventing its binding to DNA. The encoded protein, which is primarily cytosolic but can shuttle to the nucleus, also can interact with estrogen receptors alpha and beta and promote the proliferation of breast cancer, brain tumors, and lung cancer. Several transcript variants encoding different isoforms have been found for this gene. More variants exist, but their full-length natures have yet to be determined. | PBX homeobox interacting protein 1 | NA |
| ENSG00000211890 | IGHA2 | ENSG00000211890 | NA | immunoglobulin heavy constant alpha 2 (A2m marker) | NA |
| ENSG00000118257 | NRP2 | 8828 | This gene encodes a member of the neuropilin family of receptor proteins. The encoded transmembrane protein binds to SEMA3C protein {sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C} and SEMA3F protein {sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3F}, and interacts with vascular endothelial growth factor (VEGF). This protein may play a role in cardiovascular development, axon guidance, and tumorigenesis. Multiple transcript variants encoding distinct isoforms have been identified for this gene. | neuropilin 2 | NA |
| ENSG00000125618 | PAX8 | 7849 | This gene encodes a member of the paired box (PAX) family of transcription factors. Members of this gene family typically encode proteins that contain a paired box domain, an octapeptide, and a paired-type homeodomain. This nuclear protein is involved in thyroid follicular cell development and expression of thyroid-specific genes. Mutations in this gene have been associated with thyroid dysgenesis, thyroid follicular carcinomas and atypical follicular thyroid adenomas. Alternatively spliced transcript variants encoding different isoforms have been described. | paired box 8 | NA |
| ENSG00000225972 | MTND1P23 | ENSG00000225972 | NA | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 1 pseudogene 23 | NA |
| ENSG00000109846 | CRYAB | 1410 | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | crystallin alpha B | NA |
| ENSG00000175206 | NPPA | 4878 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | natriuretic peptide A | NA |
| ENSG00000243955 | GSTA1 | 2938 | This gene encodes a member of a family of enzymes that function to add glutathione to target electrophilic compounds, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. This action is an important step in detoxification of these compounds. This subfamily of enzymes has a particular role in protecting cells from reactive oxygen species and the products of peroxidation. Polymorphisms in this gene influence the ability of individuals to metabolize different drugs. This gene is located in a cluster of similar genes and pseudogenes on chromosome 6. Alternative splicing results in multiple transcript variants. | glutathione S-transferase alpha 1 | NA |
| ENSG00000156113 | KCNMA1 | 3778 | MaxiK channels are large conductance, voltage and calcium-sensitive potassium channels which are fundamental to the control of smooth muscle tone and neuronal excitability. MaxiK channels can be formed by 2 subunits: the pore-forming alpha subunit, which is the product of this gene, and the modulatory beta subunit. Intracellular calcium regulates the physical association between the alpha and beta subunits. Alternatively spliced transcript variants encoding different isoforms have been identified. | potassium calcium-activated channel subfamily M alpha 1 | NA |
| ENSG00000035862 | TIMP2 | 7077 | This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | TIMP metallopeptidase inhibitor 2 | NA |
| ENSG00000183091 | NEB | 4703 | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | nebulin | NA |
| ENSG00000112378 | PERP | 64065 | NA | PERP, TP53 apoptosis effector | NA |
| ENSG00000198624 | CCDC69 | 26112 | NA | coiled-coil domain containing 69 | NA |
| ENSG00000234961 | RP11-124N14.3 | ENSG00000234961 | NA | NA | NA |
| ENSG00000142173 | COL6A2 | 1292 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | collagen type VI alpha 2 | NA |
| ENSG00000091490 | SEL1L3 | 23231 | NA | SEL1L family member 3 | NA |
| ENSG00000163220 | S100A9 | 6280 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | S100 calcium binding protein A9 | NA |
| ENSG00000263335 | AF001548.5 | ENSG00000263335 | NA | NA | NA |
| ENSG00000187288 | CIDEC | 63924 | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | cell death inducing DFFA like effector c | NA |
| ENSG00000133065 | SLC41A1 | 254428 | NA | solute carrier family 41 member 1 | NA |
| ENSG00000147465 | STAR | 6770 | The protein encoded by this gene plays a key role in the acute regulation of steroid hormone synthesis by enhancing the conversion of cholesterol into pregnenolone. This protein permits the cleavage of cholesterol into pregnenolone by mediating the transport of cholesterol from the outer mitochondrial membrane to the inner mitochondrial membrane. Mutations in this gene are a cause of congenital lipoid adrenal hyperplasia (CLAH), also called lipoid CAH. A pseudogene of this gene is located on chromosome 13. | steroidogenic acute regulatory protein | NA |
| ENSG00000008394 | MGST1 | 4257 | The MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism) family consists of six human proteins, two of which are involved in the production of leukotrienes and prostaglandin E, important mediators of inflammation. Other family members, demonstrating glutathione S-transferase and peroxidase activities, are involved in cellular defense against toxic, carcinogenic, and pharmacologically active electrophilic compounds. This gene encodes a protein that catalyzes the conjugation of glutathione to electrophiles and the reduction of lipid hydroperoxides. This protein is localized to the endoplasmic reticulum and outer mitochondrial membrane where it is thought to protect these membranes from oxidative stress. Several transcript variants, some non-protein coding and some protein coding, have been found for this gene. | microsomal glutathione S-transferase 1 | NA |
| ENSG00000109099 | PMP22 | 5376 | This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | peripheral myelin protein 22 | NA |
| ENSG00000078114 | NEBL | 10529 | This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. | nebulette | NA |
| ENSG00000147526 | TACC1 | 6867 | This locus may represent a breast cancer candidate gene. It is located close to FGFR1 on a region of chromosome 8 that is amplified in some breast cancers. Three transcript variants encoding different isoforms have been found for this gene. | transforming acidic coiled-coil containing protein 1 | NA |
| ENSG00000136999 | NOV | 4856 | The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | nephroblastoma overexpressed | NA |
| ENSG00000197747 | S100A10 | 6281 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in exocytosis and endocytosis. | S100 calcium binding protein A10 | NA |
| ENSG00000179218 | CALR | 811 | Calreticulin is a multifunctional protein that acts as a major Ca(2+)-binding (storage) protein in the lumen of the endoplasmic reticulum. It is also found in the nucleus, suggesting that it may have a role in transcription regulation. Calreticulin binds to the synthetic peptide KLGFFKR, which is almost identical to an amino acid sequence in the DNA-binding domain of the superfamily of nuclear receptors. Calreticulin binds to antibodies in certain sera of systemic lupus and Sjogren patients which contain anti-Ro/SSA antibodies, it is highly conserved among species, and it is located in the endoplasmic and sarcoplasmic reticulum where it may bind calcium. The amino terminus of calreticulin interacts with the DNA-binding domain of the glucocorticoid receptor and prevents the receptor from binding to its specific glucocorticoid response element. Calreticulin can inhibit the binding of androgen receptor to its hormone-responsive DNA element and can inhibit androgen receptor and retinoic acid receptor transcriptional activities in vivo, as well as retinoic acid-induced neuronal differentiation. Thus, calreticulin can act as an important modulator of the regulation of gene transcription by nuclear hormone receptors. Systemic lupus erythematosus is associated with increased autoantibody titers against calreticulin but calreticulin is not a Ro/SS-A antigen. Earlier papers referred to calreticulin as an Ro/SS-A antigen but this was later disproven. Increased autoantibody titer against human calreticulin is found in infants with complete congenital heart block of both the IgG and IgM classes. | calreticulin | NA |
| ENSG00000111245 | MYL2 | 4633 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | myosin light chain 2 | NA |
| ENSG00000256545 | NA | NA | NA | NA | TRUE |
| ENSG00000229344 | MTCO2P12 | ENSG00000229344 | NA | MT-CO2 pseudogene 12 | NA |
| ENSG00000058668 | ATP2B4 | 493 | The protein encoded by this gene belongs to the family of P-type primary ion transport ATPases characterized by the formation of an aspartyl phosphate intermediate during the reaction cycle. These enzymes remove bivalent calcium ions from eukaryotic cells against very large concentration gradients and play a critical role in intracellular calcium homeostasis. The mammalian plasma membrane calcium ATPase isoforms are encoded by at least four separate genes and the diversity of these enzymes is further increased by alternative splicing of transcripts. The expression of different isoforms and splice variants is regulated in a developmental, tissue- and cell type-specific manner, suggesting that these pumps are functionally adapted to the physiological needs of particular cells and tissues. This gene encodes the plasma membrane calcium ATPase isoform 4. Alternatively spliced transcript variants encoding different isoforms have been identified. | ATPase plasma membrane Ca2+ transporting 4 | NA |
| ENSG00000180139 | ACTA2-AS1 | ENSG00000180139 | NA | ACTA2 antisense RNA 1 | NA |
| ENSG00000135821 | GLUL | 2752 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | glutamate-ammonia ligase | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",11,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
out <- mygene::queryMany(gene_list[12,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | query | name | X_id | symbol | notfound |
|---|---|---|---|---|---|
| This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | ENSG00000115414 | fibronectin 1 | 2335 | FN1 | NA |
| This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | ENSG00000026025 | vimentin | 7431 | VIM | NA |
| This gene encodes a cysteine-rich acidic matrix-associated protein. The encoded protein is required for the collagen in bone to become calcified but is also involved in extracellular matrix synthesis and promotion of changes to cell shape. The gene product has been associated with tumor suppression but has also been correlated with metastasis based on changes to cell shape which can promote tumor cell invasion. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000113140 | secreted protein acidic and cysteine rich | 6678 | SPARC | NA |
| Synaptopodin is an actin-associated protein that may play a role in actin-based cell shape and motility. The name synaptopodin derives from the protein’s associations with postsynaptic densities and dendritic spines and with renal podocytes (Mundel et al., 1997 [PubMed 9314539]). | ENSG00000171992 | synaptopodin | 11346 | SYNPO | NA |
| NA | ENSG00000229124 | VIM antisense RNA 1 | 100507347 | VIM-AS1 | NA |
| This gene encodes a conventional non-muscle myosin; this protein should not be confused with the unconventional myosin-9a or 9b (MYO9A or MYO9B). The encoded protein is a myosin IIA heavy chain that contains an IQ domain and a myosin head-like domain which is involved in several important functions, including cytokinesis, cell motility and maintenance of cell shape. Defects in this gene have been associated with non-syndromic sensorineural deafness autosomal dominant type 17, Epstein syndrome, Alport syndrome with macrothrombocytopenia, Sebastian syndrome, Fechtner syndrome and macrothrombocytopenia with progressive sensorineural deafness. | ENSG00000100345 | myosin, heavy chain 9, non-muscle | 4627 | MYH9 | NA |
| NA | ENSG00000234961 | NA | ENSG00000234961 | RP11-124N14.3 | NA |
| The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000133392 | myosin, heavy chain 11, smooth muscle | 4629 | MYH11 | NA |
| The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | ENSG00000119681 | latent transforming growth factor beta binding protein 2 | 4053 | LTBP2 | NA |
| This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | ENSG00000140545 | milk fat globule-EGF factor 8 protein | 4240 | MFGE8 | NA |
| The protein encoded by this gene is a mitogen that is secreted by vascular endothelial cells. The encoded protein plays a role in chondrocyte proliferation and differentiation, cell adhesion in many cell types, and is related to platelet-derived growth factor. Certain polymorphisms in this gene have been linked with a higher incidence of systemic sclerosis. | ENSG00000118523 | connective tissue growth factor | 1490 | CTGF | NA |
| This gene encodes the third discovered human homologue of the Drosophila melanogaster type I membrane protein notch. In Drosophila, notch interaction with its cell-bound ligands (delta, serrate) establishes an intercellular signalling pathway that plays a key role in neural development. Homologues of the notch-ligands have also been identified in human, but precise interactions between these ligands and the human notch homologues remain to be determined. Mutations in NOTCH3 have been identified as the underlying cause of cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL). | ENSG00000074181 | notch 3 | 4854 | NOTCH3 | NA |
| NA | ENSG00000112096 | uncharacterized LOC100129518 | 100129518 | LOC100129518 | NA |
| This gene is a member of the iron/manganese superoxide dismutase family. It encodes a mitochondrial protein that forms a homotetramer and binds one manganese ion per subunit. This protein binds to the superoxide byproducts of oxidative phosphorylation and converts them to hydrogen peroxide and diatomic oxygen. Mutations in this gene have been associated with idiopathic cardiomyopathy (IDC), premature aging, sporadic motor neuron disease, and cancer. Alternative splicing of this gene results in multiple transcript variants. A related pseudogene has been identified on chromosome 1. | ENSG00000112096 | superoxide dismutase 2, mitochondrial | 6648 | SOD2 | NA |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | ENSG00000163220 | S100 calcium binding protein A9 | 6280 | S100A9 | NA |
| This gene encodes a protein involved in glycolysis. The encoded protein is a pyruvate kinase that catalyzes the transfer of a phosphoryl group from phosphoenolpyruvate to ADP, generating ATP and pyruvate. This protein has been shown to interact with thyroid hormone and may mediate cellular metabolic effects induced by thyroid hormones. This protein has been found to bind Opa protein, a bacterial outer membrane protein involved in gonococcal adherence to and invasion of human cells, suggesting a role of this protein in bacterial pathogenesis. Several alternatively spliced transcript variants encoding a few distinct isoforms have been reported. | ENSG00000067225 | pyruvate kinase, muscle | 5315 | PKM | NA |
| This gene is a member of the TIMP gene family. The proteins encoded by this gene family are natural inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. In addition to an inhibitory role against metalloproteinases, the encoded protein has a unique role among TIMP family members in its ability to directly suppress the proliferation of endothelial cells. As a result, the encoded protein may be critical to the maintenance of tissue homeostasis by suppressing the proliferation of quiescent tissues in response to angiogenic factors, and by inhibiting protease activity in tissues undergoing remodelling of the extracellular matrix. | ENSG00000035862 | TIMP metallopeptidase inhibitor 2 | 7077 | TIMP2 | NA |
| Plectin is a prominent member of an important family of structurally and in part functionally related proteins, termed plakins or cytolinkers, that are capable of interlinking different elements of the cytoskeleton. Plakins, with their multi-domain structure and enormous size, not only play crucial roles in maintaining cell and tissue integrity and orchestrating dynamic changes in cytoarchitecture and cell shape, but also serve as scaffolding platforms for the assembly, positioning, and regulation of signaling complexes (reviewed in PMID: 9701547, 11854008, and 17499243). Plectin is expressed as several protein isoforms in a wide range of cell types and tissues from a single gene located on chromosome 8 in humans (PMID: 8633055, 8698233). Until 2010, this locus was named plectin 1 (symbol PLEC1 in human; Plec1 in mouse and rat) and the gene product had been referred to as ‘hemidesmosomal protein 1’ or ‘plectin 1, intermediate filament binding 500kDa’. These names were superseded by plectin. The plectin gene locus in mouse on chromosome 15 has been analyzed in detail (PMID: 10556294, 14559777), revealing a genomic exon-intron organization with well over 40 exons spanning over 62 kb and an unusual 5’ transcript complexity of plectin isoforms. Eleven exons (1-1j) have been identified that alternatively splice directly into a common exon 2 which is the first exon to encode plectin’s highly conserved actin binding domain (ABD). Three additional exons (-1, 0a, and 0) splice into an alternative first coding exon (1c), and two additional exons (2alpha and 3alpha) are optionally spliced within the exons encoding the actin binding domain (exons 2-8). Analysis of the human locus has identified eight of the eleven alternative 5’ exons found in mouse and rat (PMID: 14672974); exons 1i, 1j and 1h have not been confirmed in human. Furthermore, isoforms lacking the central rod domain encoded by exon 31 have been detected in mouse (PMID:10556294), rat (PMID: 9177781), and human (PMID: 11441066, 10780662, 20052759). The short alternative amino-terminal sequences encoded by the different first exons direct the targeting of the various isoforms to distinct subcellular locations (PMID: 14559777). As the expression of specific plectin isoforms was found to be dependent on cell type (tissue) and stage of development (PMID: 10556294, 12542521, 17389230) it appears that each cell type (tissue) contains a unique set (proportion and composition) of plectin isoforms, as if custom-made for specific requirements of the particular cells. Concordantly, individual isoforms were found to carry out distinct and specific functions (PMID: 14559777, 12542521, 18541706). In 1996, a number of groups reported that patients suffering from epidermolysis bullosa simplex with muscular dystrophy (EBS-MD) lacked plectin expression in skin and muscle tissues due to defects in the plectin gene (PMID: 8698233, 8941634, 8636409, 8894687, 8696340). Two other subtypes of plectin-related EBS have been described: EBS-pyloric atresia (PA) and EBS-Ogna. For reviews of plectin-related diseases see PMID: 15810881, 19945614. Mutations in the plectin gene related to human diseases should be named based on the position in NM_000445 (variant 1, isoform 1c), unless the mutation is located within one of the other alternative first exons, in which case the position in the respective Reference Sequence should be used. | ENSG00000178209 | plectin | 5339 | PLEC | NA |
| The protein encoded by this gene binds transforming growth factor beta (TGFB) as it is secreted and targeted to the extracellular matrix. TGFB is biologically latent after secretion and insertion into the extracellular matrix, and sheds TGFB and other proteins upon activation. Defects in this gene may be a cause of cutis laxa and severe pulmonary, gastrointestinal, and urinary abnormalities. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000090006 | latent transforming growth factor beta binding protein 4 | 8425 | LTBP4 | NA |
| This gene encodes a protein that is a member of the dickkopf family. The secreted protein contains two cysteine rich regions and is involved in embryonic development through its interactions with the Wnt signaling pathway. The expression of this gene is decreased in a variety of cancer cell lines and it may function as a tumor suppressor gene. Alternative splicing results in multiple transcript variants encoding the same protein. | ENSG00000050165 | dickkopf WNT signaling pathway inhibitor 3 | 27122 | DKK3 | NA |
| This gene encodes a member of the insulin-like growth factor (IGF)-binding protein (IGFBP) family. IGFBPs bind IGFs with high affinity, and regulate IGF availability in body fluids and tissues and modulate IGF binding to its receptors. This protein binds IGF-I and IGF-II with relatively low affinity, and belongs to a subfamily of low-affinity IGFBPs. It also stimulates prostacyclin production and cell adhesion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene, and one variant has been associated with retinal arterial macroaneurysm (PMID:21835307). | ENSG00000163453 | insulin like growth factor binding protein 7 | 3490 | IGFBP7 | NA |
| This gene encodes a gamma-carboxyglutamic acid (Gla)-containing protein thought to be involved in the stimulation of cell proliferation. This gene is frequently overexpressed in many cancers and has been implicated as an adverse prognostic marker. Elevated protein levels are additionally associated with a variety of disease states, including venous thromboembolic disease, systemic lupus erythematosus, chronic renal failure, and preeclampsia. | ENSG00000183087 | growth arrest specific 6 | 2621 | GAS6 | NA |
| This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000049540 | elastin | 2006 | ELN | NA |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000143546 | S100 calcium binding protein A8 | 6279 | S100A8 | NA |
| This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | ENSG00000106624 | AE binding protein 1 | 165 | AEBP1 | NA |
| The protein encoded by this gene is a secreted, extracellular matrix protein containing an Arg-Gly-Asp (RGD) motif and calcium-binding EGF-like domains. It promotes adhesion of endothelial cells through interaction of integrins and the RGD motif. It is prominently expressed in developing arteries but less so in adult vessels. However, its expression is reinduced in balloon-injured vessels and atherosclerotic lesions, notably in intimal vascular smooth muscle cells and endothelial cells. Therefore, the protein encoded by this gene may play a role in vascular development and remodeling. Defects in this gene are a cause of autosomal dominant cutis laxa, autosomal recessive cutis laxa type I (CL type I), and age-related macular degeneration type 3 (ARMD3). | ENSG00000140092 | fibulin 5 | 10516 | FBLN5 | NA |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | ENSG00000171401 | keratin 13 | 3860 | KRT13 | NA |
| The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | ENSG00000143632 | actin, alpha 1, skeletal muscle | 58 | ACTA1 | NA |
| This gene encodes a putative transcription factor with two LIM zinc-binding domains. The encoded protein may participate in the differentiation of smooth muscle tissue. Alternative splicing results in multiple transcript variants. | ENSG00000182809 | cysteine rich protein 2 | 1397 | CRIP2 | NA |
| This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | ENSG00000111640 | glyceraldehyde-3-phosphate dehydrogenase | 2597 | GAPDH | NA |
| This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | ENSG00000189058 | apolipoprotein D | 347 | APOD | NA |
| The protein encoded by this gene is a leucine-rich repeat protein present in connective tissue extracellular matrix. This protein functions as a molecule anchoring basement membranes to the underlying connective tissue. This protein has been shown to bind type I collagen to basement membranes and type II collagen to cartilage. It also binds the basement membrane heparan sulfate proteoglycan perlecan. This protein is suggested to be involved in the pathogenesis of Hutchinson-Gilford progeria (HGP), which is reported to lack the binding of collagen in basement membranes and cartilage. Alternatively spliced transcript variants encoding the same protein have been observed. | ENSG00000188783 | proline and arginine rich end leucine rich repeat protein | 5549 | PRELP | NA |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in motility, invasion, and tubulin polymerization. Chromosomal rearrangements and altered expression of this gene have been implicated in tumor metastasis. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ENSG00000196154 | S100 calcium binding protein A4 | 6275 | S100A4 | NA |
| The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in stimulation of Ca2+-dependent insulin release, stimulation of prolactin secretion, and exocytosis. Chromosomal rearrangements and altered expression of this gene have been implicated in melanoma. | ENSG00000197956 | S100 calcium binding protein A6 | 6277 | S100A6 | NA |
| NA | ENSG00000129353 | solute carrier family 44 member 2 | 57153 | SLC44A2 | NA |
| This gene encodes a highly conserved preproprotein that is proteolytically processed to generate four main cleavage products including saposins A, B, C, and D. Each domain of the precursor protein is approximately 80 amino acid residues long with nearly identical placement of cysteine residues and glycosylation sites. Saposins A-D localize primarily to the lysosomal compartment where they facilitate the catabolism of glycosphingolipids with short oligosaccharide groups. The precursor protein exists both as a secretory protein and as an integral membrane protein and has neurotrophic activities. Mutations in this gene have been associated with Gaucher disease and metachromatic leukodystrophy. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000197746 | prosaposin | 5660 | PSAP | NA |
| This gene is specifically expressed in Schwann cells of the peripheral nervous system and encodes a type I transmembrane glycoprotein that is a major structural protein of the peripheral myelin sheath. The encoded protein contains a large hydrophobic extracellular domain and a smaller basic intracellular domain, which are essential for the formation and stabilization of the multilamellar structure of the compact myelin. Mutations in this gene are associated with autosomal dominant form of Charcot-Marie-Tooth disease type 1 (CMT1B) and other polyneuropathies, such as Dejerine-Sottas syndrome (DSS) and congenital hypomyelinating neuropathy (CHN). A recent study showed that two isoforms are produced from the same mRNA by use of alternative in-frame translation termination codons via a stop codon readthrough mechanism. | ENSG00000158887 | myelin protein zero | 4359 | MPZ | NA |
| This gene encodes the perlecan protein, which consists of a core protein to which three long chains of glycosaminoglycans (heparan sulfate or chondroitin sulfate) are attached. The perlecan protein is a large multidomain proteoglycan that binds to and cross-links many extracellular matrix components and cell-surface molecules. It has been shown that this protein interacts with laminin, prolargin, collagen type IV, FGFBP1, FBLN2, FGF7 and transthyretin, etc., and it plays essential roles in multiple biological activities. Perlecan is a key component of the vascular extracellular matrix, where it helps to maintain the endothelial barrier function. It is a potent inhibitor of smooth muscle cell proliferation and is thus thought to help maintain vascular homeostasis. It can also promote growth factor (e.g., FGF2) activity and thus stimulate endothelial growth and re-generation. It is a major component of basement membranes, where it is involved in the stabilization of other molecules as well as being involved with glomerular permeability to macromolecules and cell adhesion. Mutations in this gene cause Schwartz-Jampel syndrome type 1, Silverman-Handmaker type of dyssegmental dysplasia, and tardive dyskinesia. Alternative splicing of this gene results in multiple transcript variants. | ENSG00000142798 | heparan sulfate proteoglycan 2 | 3339 | HSPG2 | NA |
| NA | ENSG00000117289 | NA | NA | NA | TRUE |
| This gene encodes a member of the WNT1 inducible signaling pathway (WISP) protein subfamily, which belongs to the connective tissue growth factor (CTGF) family. WNT1 is a member of a family of cysteine-rich, glycosylated signaling proteins that mediate diverse developmental processes. The CTGF family members are characterized by four conserved cysteine-rich domains: insulin-like growth factor-binding domain, von Willebrand factor type C module, thrombospondin domain and C-terminal cystine knot-like (CT) domain. The encoded protein lacks the CT domain which is implicated in dimerization and heparin binding. It is 72% identical to the mouse protein at the amino acid level. This gene may be downstream in the WNT1 signaling pathway that is relevant to malignant transformation. Its expression in colon tumors is reduced while the other two WISP members are overexpressed in colon tumors. It is expressed at high levels in bone tissue, and may play an important role in modulating bone turnover. | ENSG00000064205 | WNT1 inducible signaling pathway protein 2 | 8839 | WISP2 | NA |
| NA | ENSG00000124942 | AHNAK nucleoprotein | 79026 | AHNAK | NA |
| Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | ENSG00000163017 | actin, gamma 2, smooth muscle, enteric | 72 | ACTG2 | NA |
| This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000164692 | collagen type I alpha 2 chain | 1278 | COL1A2 | NA |
| This gene is a member of the aggrecan/versican proteoglycan family. The protein encoded is a large chondroitin sulfate proteoglycan and is a major component of the extracellular matrix. This protein is involved in cell adhesion, proliferation, migration and angiogenesis and plays a central role in tissue morphogenesis and maintenance. Mutations in this gene are the cause of Wagner syndrome type 1. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000038427 | versican | 1462 | VCAN | NA |
| The protein encoded by this gene is a major non-neuronal microtubule-associated protein. This protein contains a domain similar to the microtubule-binding domains of neuronal microtubule-associated protein (MAP2) and microtubule-associated protein tau (MAPT/TAU). This protein promotes microtubule assembly, and has been shown to counteract destabilization of interphase microtubule catastrophe promotion. Cyclin B was found to interact with this protein, which targets cell division cycle 2 (CDC2) kinase to microtubules. The phosphorylation of this protein affects microtubule properties and cell cycle progression. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000047849 | microtubule associated protein 4 | 4134 | MAP4 | NA |
| This gene encodes a member of the fibulin family of extracellular matrix glycoproteins. Like all members of this family, the encoded protein contains tandemly repeated epidermal growth factor-like repeats followed by a C-terminus fibulin-type domain. This gene is upregulated in malignant gliomas and may play a role in the aggressive nature of these tumors. Mutations in this gene are associated with Doyne honeycomb retinal dystrophy. Alternatively spliced transcript variants that encode the same protein have been described. | ENSG00000115380 | EGF containing fibulin like extracellular matrix protein 1 | 2202 | EFEMP1 | NA |
| APM2 gene is exclusively expressed in adipose tissue. Its function is currently unknown. | ENSG00000148671 | adipogenesis regulatory factor | 10974 | ADIRF | NA |
| Fibromodulin belongs to the family of small interstitial proteoglycans. The encoded protein possesses a central region containing leucine-rich repeats with 4 keratan sulfate chains, flanked by terminal domains containing disulphide bonds. Owing to the interaction with type I and type II collagen fibrils and in vitro inhibition of fibrillogenesis, the encoded protein may play a role in the assembly of extracellular matrix. It may also regulate TGF-beta activities by sequestering TGF-beta into the extracellular matrix. Sequence variations in this gene may be associated with the pathogenesis of high myopia. Alternative splicing results in multiple transcript variants. | ENSG00000122176 | fibromodulin | 2331 | FMOD | NA |
| This gene encodes a major glucose transporter in the mammalian blood-brain barrier. The encoded protein is found primarily in the cell membrane and on the cell surface, where it can also function as a receptor for human T-cell leukemia virus (HTLV) I and II. Mutations in this gene have been found in a family with paroxysmal exertion-induced dyskinesia. | ENSG00000117394 | solute carrier family 2 member 1 | 6513 | SLC2A1 | NA |
| This gene encodes cytochrome b5 reductase, which includes a membrane-bound form in somatic cells (anchored in the endoplasmic reticulum, mitochondrial and other membranes) and a soluble form in erythrocytes. The membrane-bound form exists mainly on the cytoplasmic side of the endoplasmic reticulum and functions in desaturation and elongation of fatty acids, in cholesterol biosynthesis, and in drug metabolism. The erythrocyte form is located in a soluble fraction of circulating erythrocytes and is involved in methemoglobin reduction. The membrane-bound form has both membrane-binding and catalytic domains, while the soluble form has only the catalytic domain. Alternate splicing results in multiple transcript variants. Mutations in this gene cause methemoglobinemias. | ENSG00000100243 | cytochrome b5 reductase 3 | 1727 | CYB5R3 | NA |
| The protein encoded by this gene belongs to the thrombospondin family. It is a disulfide-linked homotrimeric glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein has been shown to function as a potent inhibitor of tumor growth and angiogenesis. Studies of the mouse counterpart suggest that this protein may modulate the cell surface properties of mesenchymal cells and be involved in cell adhesion and migration. | ENSG00000186340 | thrombospondin 2 | 7058 | THBS2 | NA |
| Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | ENSG00000109846 | crystallin alpha B | 1410 | CRYAB | NA |
| This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000106366 | serpin family E member 1 | 5054 | SERPINE1 | NA |
| This gene encodes a protein involved in peripheral nerve myelin upkeep. The encoded protein contains 2 PDZ domains which were named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1 (a mammalian tight junction protein). Two alternatively spliced transcript variants have been described for this gene which encode different protein isoforms and which are targeted differently in the Schwann cell. Mutations in this gene cause Charcot-Marie-Tooth neuropathy, type 4F and Dejerine-Sottas neuropathy. | ENSG00000105227 | periaxin | 57716 | PRX | NA |
| NA | ENSG00000176658 | myosin ID | 4642 | MYO1D | NA |
| Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, alpha actinin isoform which is concentrated in the cytoplasm, and thought to be involved in metastatic processes. Mutations in this gene have been associated with focal and segmental glomerulosclerosis. | ENSG00000130402 | actinin alpha 4 | 81 | ACTN4 | NA |
| NA | ENSG00000119280 | chromosome 1 open reading frame 198 | 84886 | C1orf198 | NA |
| The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000170477 | keratin 4 | 3851 | KRT4 | NA |
| NA | ENSG00000082781 | integrin subunit beta 5 | 3693 | ITGB5 | NA |
| NA | ENSG00000167779 | insulin like growth factor binding protein 6 | 3489 | IGFBP6 | NA |
| NA | ENSG00000099994 | sushi domain containing 2 | 56241 | SUSD2 | NA |
| This gene encodes a receptor for inositol 1,4,5-trisphosphate, a second messenger that mediates the release of intracellular calcium. The receptor contains a calcium channel at the C-terminus and the ligand-binding site at the N-terminus. Knockout studies in mice suggest that type 2 and type 3 inositol 1,4,5-trisphosphate receptors play a key role in exocrine secretion underlying energy metabolism and growth. | ENSG00000096433 | inositol 1,4,5-trisphosphate receptor type 3 | 3710 | ITPR3 | NA |
| This gene encodes a member of the A1 family of peptidases. The encoded preproprotein is proteolytically processed to generate multiple protein products. These products include the cathepsin D light and heavy chains, which heterodimerize to form the mature enzyme. This enzyme exhibits pepsin-like activity and plays a role in protein turnover and in the proteolytic activation of hormones and growth factors. Mutations in this gene play a causal role in neuronal ceroid lipofuscinosis-10 and may be involved in the pathogenesis of several other diseases, including breast cancer and possibly Alzheimer’s disease. | ENSG00000117984 | cathepsin D | 1509 | CTSD | NA |
| The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | ENSG00000107317 | prostaglandin D2 synthase | 5730 | PTGDS | NA |
| This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or, intracellularly by its S-glutathiolation with no requirement for proteolytic removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000087245 | matrix metallopeptidase 2 | 4313 | MMP2 | NA |
| This gene encodes a member of the low-density lipoprotein receptor family of proteins. The encoded preproprotein is proteolytically processed by furin to generate 515 kDa and 85 kDa subunits that form the mature receptor (PMID: 8546712). This receptor is involved in several cellular processes, including intracellular signaling, lipid homeostasis, and clearance of apoptotic cells. In addition, the encoded protein is necessary for the alpha 2-macroglobulin-mediated clearance of secreted amyloid precursor protein and beta-amyloid, the main component of amyloid plaques found in Alzheimer patients. Expression of this gene decreases with age and has been found to be lower than controls in brain tissue from Alzheimer’s disease patients. | ENSG00000123384 | LDL receptor related protein 1 | 4035 | LRP1 | NA |
| This gene encodes a protein containing a calponin homology (CH) domain, a PDZ domain, and a LIM domain, and may be involved in protein-protein interactions. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, however, the full-length nature of some variants is not known. | ENSG00000136153 | LIM domain 7 | 4008 | LMO7 | NA |
| This gene encodes one of the two alpha chains of type VIII collagen. The gene product is a short chain collagen and a major component of the basement membrane of the corneal endothelium. The type VIII collagen fibril can be either a homo- or a heterotrimer. Alternatively spliced transcript variants encoding the same protein have been observed. | ENSG00000144810 | collagen type VIII alpha 1 | 1295 | COL8A1 | NA |
| This gene encodes a transmembrane protein containing six cysteine-rich repeat domains and an insulin-like growth factor-binding domain. The encoded protein may play a role in tissue development through interactions with members of the transforming growth factor beta family, such as bone morphogenetic proteins. | ENSG00000150938 | cysteine rich transmembrane BMP regulator 1 (chordin-like) | 51232 | CRIM1 | NA |
| Syntrophins are cytoplasmic peripheral membrane scaffold proteins that are components of the dystrophin-associated protein complex. This gene is a member of the syntrophin gene family and encodes the most common syntrophin isoform found in cardiac tissues. The N-terminal PDZ domain of this syntrophin protein interacts with the C-terminus of the pore-forming alpha subunit (SCN5A) of the cardiac sodium channel Nav1.5. This protein also associates cardiac sodium channels with the nitric oxide synthase-PMCA4b (plasma membrane Ca-ATPase subtype 4b) complex in cardiomyocytes. This gene is a susceptibility locus for Long-QT syndrome (LQT) - an inherited disorder associated with sudden cardiac death from arrhythmia - and sudden infant death syndrome (SIDS). This protein also associates with dystrophin and dystrophin-related proteins at the neuromuscular junction and alters intracellular calcium ion levels in muscle tissue. | ENSG00000101400 | syntrophin alpha 1 | 6640 | SNTA1 | NA |
| Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000077522 | actinin alpha 2 | 88 | ACTN2 | NA |
| ERRFI1 is a cytoplasmic protein whose expression is upregulated with cell growth (Wick et al., 1995 [PubMed 7641805]). It shares significant homology with the protein product of rat gene-33, which is induced during cell stress and mediates cell signaling (Makkinje et al., 2000 [PubMed 10749885]; Fiorentino et al., 2000 [PubMed 11003669]). | ENSG00000116285 | ERBB receptor feedback inhibitor 1 | 54206 | ERRFI1 | NA |
| The protein encoded by this gene is a member of a family of membrane glycoproteins. This glycoprotein provides selectins with carbohydrate ligands. It may also play a role in tumor cell metastasis. | ENSG00000185896 | lysosomal associated membrane protein 1 | 3916 | LAMP1 | NA |
| This gene encodes a member of the epidermal growth factor (EGF) receptor family of receptor tyrosine kinases. This protein has no ligand binding domain of its own and therefore cannot bind growth factors. However, it does bind tightly to other ligand-bound EGF receptor family members to form a heterodimer, stabilizing ligand binding and enhancing kinase-mediated activation of downstream signalling pathways, such as those involving mitogen-activated protein kinase and phosphatidylinositol-3 kinase. Allelic variations at amino acid positions 654 and 655 of isoform a (positions 624 and 625 of isoform b) have been reported, with the most common allele, Ile654/Ile655, shown here. Amplification and/or overexpression of this gene has been reported in numerous cancers, including breast and ovarian tumors. Alternative splicing results in several additional transcript variants, some encoding different isoforms and others that have not been fully characterized. | ENSG00000141736 | erb-b2 receptor tyrosine kinase 2 | 2064 | ERBB2 | NA |
| This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | ENSG00000122367 | LIM domain binding 3 | 11155 | LDB3 | NA |
| This gene encodes a protein that catalyzes the condensation of nicotinamide with 5-phosphoribosyl-1-pyrophosphate to yield nicotinamide mononucleotide, one step in the biosynthesis of nicotinamide adenine dinucleotide. The protein belongs to the nicotinic acid phosphoribosyltransferase (NAPRTase) family and is thought to be involved in many important biological processes, including metabolism, stress response and aging. This gene has a pseudogene on chromosome 10. | ENSG00000105835 | nicotinamide phosphoribosyltransferase | 10135 | NAMPT | NA |
| The protein encoded by this gene was identified as a binding protein of the protein kinase C, delta (PRKCD). The expression of this gene in cultured cell lines is strongly induced by serum starvation. The expression of this protein was found to be down-regulated in various cancer cell lines, suggesting the possible tumor suppressor function of this protein. | ENSG00000170955 | protein kinase C delta binding protein | 112464 | PRKCDBP | NA |
| This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | ENSG00000109099 | peripheral myelin protein 22 | 5376 | PMP22 | NA |
| NA | ENSG00000151468 | coiled-coil domain containing 3 | 83643 | CCDC3 | NA |
| The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. This protein is highly expressed in brain tissue and may play a role in macrophage lipid metabolism and neural development. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000107331 | ATP binding cassette subfamily A member 2 | 20 | ABCA2 | NA |
| The protein encoded by this gene is a member of the serpin family of proteinase inhibitors. Members of this family maintain homeostasis by neutralizing overexpressed proteinase activity through their function as suicide substrates. This protein inhibits the neutrophil-derived proteinases neutrophil elastase, cathepsin G, and proteinase-3 and thus protects tissues from damage at inflammatory sites. Alternative splicing results in multiple transcript variants. | ENSG00000021355 | serpin family B member 1 | 1992 | SERPINB1 | NA |
| Secreted frizzled-related protein 4 (SFRP4) is a member of the SFRP family that contains a cysteine-rich domain homologous to the putative Wnt-binding site of Frizzled proteins. SFRPs act as soluble modulators of Wnt signaling. The expression of SFRP4 in ventricular myocardium correlates with apoptosis related gene expression. | ENSG00000106483 | secreted frizzled related protein 4 | 6424 | SFRP4 | NA |
| NA | ENSG00000091986 | coiled-coil domain containing 80 | 151887 | CCDC80 | NA |
| This antimicrobial gene belongs to the cytokine gene family which encode secreted proteins involved in immunoregulatory and inflammatory processes. The protein encoded by this gene is structurally related to the CXC (Cys-X-Cys) subfamily of cytokines. Members of this subfamily are characterized by two cysteines separated by a single amino acid. This cytokine displays chemotactic activity for monocytes but not for lymphocytes, dendritic cells, neutrophils or macrophages. It has been implicated that this cytokine is involved in the homeostasis of monocyte-derived macrophages rather than in inflammation. | ENSG00000145824 | C-X-C motif chemokine ligand 14 | 9547 | CXCL14 | NA |
| This gene encodes a nonsarcomeric myosin regulatory light chain. This protein is activated by phosphorylation and regulates smooth muscle and non-muscle cell contraction. This protein may also be involved in DNA damage repair by sequestering the transcriptional regulator apoptosis-antagonizing transcription factor (AATF)/Che-1 which functions as a repressor of p53-driven apoptosis. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 8. | ENSG00000101608 | myosin light chain 12A | 10627 | MYL12A | NA |
| NA | ENSG00000136205 | tensin 3 | 64759 | TNS3 | NA |
| This gene encodes a protein that enables the dissociation of paused ternary polymerase I transcription complexes from the 3’ end of pre-rRNA transcripts. This protein regulates rRNA transcription by promoting the dissociation of transcription complexes and the reinitiation of polymerase I on nascent rRNA transcripts. This protein also localizes to caveolae at the plasma membrane and is thought to play a critical role in the formation of caveolae and the stabilization of caveolins. This protein translocates from caveolae to the cytoplasm after insulin stimulation. Caveolae contain truncated forms of this protein and may be the site of phosphorylation-dependent proteolysis. This protein is also thought to modify lipid metabolism and insulin-regulated gene expression. Mutations in this gene result in a disorder characterized by generalized lipodystrophy and muscular dystrophy. | ENSG00000177469 | polymerase I and transcript release factor | 284119 | PTRF | NA |
| Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | ENSG00000163631 | albumin | 213 | ALB | NA |
| NA | ENSG00000163209 | small proline rich protein 3 | 6707 | SPRR3 | NA |
| Kruppel-like factors (KLFs) are a family of broadly expressed zinc finger transcription factors. KLF2 regulates T-cell trafficking by promoting expression of the lipid-binding receptor S1P1 (S1PR1; MIM 601974) and the selectin CD62L (SELL; MIM 153240) (summary by Weinreich et al., 2009 [PubMed 19592277]). | ENSG00000127528 | Kruppel like factor 2 | 10365 | KLF2 | NA |
| This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | ENSG00000059915 | pleckstrin and Sec7 domain containing | 5662 | PSD | NA |
| The protein encoded by this gene binds to the ‘plus’ ends of actin monomers and filaments to prevent monomer exchange. The encoded calcium-regulated protein functions in both assembly and disassembly of actin filaments. Defects in this gene are a cause of familial amyloidosis Finnish type (FAF). Multiple transcript variants encoding several different isoforms have been found for this gene. | ENSG00000148180 | gelsolin | 2934 | GSN | NA |
| This gene encodes a member of the chaperonin family. The encoded mitochondrial protein may function as a signaling molecule in the innate immune system. This protein is essential for the folding and assembly of newly imported proteins in the mitochondria. This gene is adjacent to a related family member and the region between the 2 genes functions as a bidirectional promoter. Several pseudogenes have been associated with this gene. Two transcript variants encoding the same protein have been identified for this gene. Mutations associated with this gene cause autosomal recessive spastic paraplegia 13. | ENSG00000144381 | heat shock protein family D (Hsp60) member 1 | 3329 | HSPD1 | NA |
| This gene encodes an SH3 domain-containing adaptor protein. The presence of SH3 domains plays a role in this protein’s ability to bind other cytoplasmic molecules and contribute to cytoskeletal organization, cell adhesion and migration, signaling, and gene expression. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000120896 | sorbin and SH3 domain containing 3 | 10174 | SORBS3 | NA |
| This gene shares both structural and functional similarities with the dystrophin gene. It contains an actin-binding N-terminus, a triple coiled-coil repeat central region, and a C-terminus that consists of protein-protein interaction motifs which interact with dystroglycan protein components. The protein encoded by this gene is located at the neuromuscular synapse and myotendinous junctions, where it participates in post-synaptic membrane maintenance and acetylcholine receptor clustering. Mouse studies suggest that this gene may serve as a functional substitute for the dystrophin gene and therefore, may serve as a potential therapeutic alternative to muscular dystrophy which is caused by mutations in the dystrophin gene. Alternative splicing of the utrophin gene has been described; however, the full-length nature of these variants has not yet been determined. | ENSG00000152818 | utrophin | 7402 | UTRN | NA |
| NA | ENSG00000229732 | NA | ENSG00000229732 | AC019349.5 | NA |
| Integrins are integral transmembrane glycoproteins composed of noncovalently linked alpha and beta chains. They participate in cell adhesion as well as cell-surface mediated signalling. This gene encodes an integrin alpha chain and is expressed at high levels in chondrocytes, where it is transcriptionally regulated by AP-2epsilon and Ets-1. The protein encoded by this gene binds to collagen. Alternative splicing results in multiple transcript variants. | ENSG00000143127 | integrin subunit alpha 10 | 8515 | ITGA10 | NA |
| This gene encodes a beta integrin-related protein that is a member of the EGF-like protein family. The encoded protein contains integrin-like cysteine-rich repeats. Alternative splicing results in multiple transcript variants. | ENSG00000198542 | integrin subunit beta like 1 | 9358 | ITGBL1 | NA |
| This gene encodes the alpha chain of type XVI collagen, a member of the FACIT collagen family (fibril-associated collagens with interrupted helices). Members of this collagen family are found in association with fibril-forming collagens such as type I and II, and serve to maintain the integrity of the extracellular matrix. High levels of type XVI collagen have been found in fibroblasts and keratinocytes, and in smooth muscle and amnion. | ENSG00000084636 | collagen type XVI alpha 1 chain | 1307 | COL16A1 | NA |
| This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | ENSG00000175084 | desmin | 1674 | DES | NA |
| This gene encodes an enzyme that oxidizes methionine residues on actin, thereby promoting depolymerization of actin filaments. This protein interacts with and regulates signalling by NEDD9/CAS-L (neural precursor cell expressed, developmentally down-regulated 9). Alternative splicing results in multiple transcript variants. | ENSG00000135596 | microtubule associated monooxygenase, calponin and LIM domain containing 1 | 64780 | MICAL1 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",12,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
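
The query, display, and write steps are repeated by hand for each row of `gene_list`, with the cluster index typed into the file name each time. A minimal sketch of how the write step could be wrapped so the index is passed once is given below. The helper name `write_cluster_genes` and its `out_dir` argument are hypothetical and not part of the original analysis; the commented loop assumes network access to the mygene service and that the file index should match the `gene_list` row index.

```r
# Hypothetical helper (not part of the original analysis): write the query
# IDs for cluster k, one per line, to <out_dir>/gene_names_clus_<k>.txt.
write_cluster_genes <- function(ids, k, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  write.table(as.character(ids), path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Sketch of the per-cluster loop. It needs network access to mygene, so it is
# left commented out here.
# for (k in seq_len(nrow(gene_list))) {
#   out <- mygene::queryMany(gene_list[k, ], scopes = "ensembl.gene",
#                            fields = c("name", "summary", "symbol"),
#                            species = "human")
#   write_cluster_genes(out$query, k, "../utilities/GTEX2013_sparse_fac_sqrt")
# }
```

Returning the path invisibly makes it easy to check which file was written without printing anything during knitting.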
out <- mygene::queryMany(gene_list[13,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | X_id | name | summary | symbol | notfound |
|---|---|---|---|---|---|
| ENSG00000107796 | 59 | actin, alpha 2, smooth muscle, aorta | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ACTA2 | NA |
| ENSG00000149591 | 6876 | transgelin | The protein encoded by this gene is a transformation and shape-change sensitive actin cross-linking/gelling protein found in fibroblasts and smooth muscle. Its expression is down-regulated in many cell lines, and this down-regulation may be an early and sensitive marker for the onset of transformation. A functional role of this protein is unclear. Two transcript variants encoding the same protein have been found for this gene. | TAGLN | NA |
| ENSG00000180139 | ENSG00000180139 | ACTA2 antisense RNA 1 | NA | ACTA2-AS1 | NA |
| ENSG00000075624 | 60 | actin, beta | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ACTB | NA |
| ENSG00000171401 | 3860 | keratin 13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | KRT13 | NA |
| ENSG00000186395 | 3858 | keratin 10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | KRT10 | NA |
| ENSG00000172867 | 3849 | keratin 2 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT2 | NA |
| ENSG00000125868 | 11034 | destrin, actin depolymerizing factor | The product of this gene belongs to the actin-binding proteins ADF family. This family of proteins is responsible for enhancing the turnover rate of actin in vivo. This gene encodes the actin depolymerizing protein that severs actin filaments (F-actin) and binds to actin monomers (G-actin). Two transcript variants encoding distinct isoforms have been identified for this gene. | DSTN | NA |
| ENSG00000145012 | 4026 | LIM domain containing preferred translocation partner in lipoma | This gene encodes a member of a subfamily of LIM domain proteins that are characterized by an N-terminal proline-rich region and three C-terminal LIM domains. The encoded protein localizes to the cell periphery in focal adhesions and may be involved in cell-cell adhesion and cell motility. This protein also shuttles through the nucleus and may function as a transcriptional co-activator. This gene is located at the junction of certain disease-related chromosomal translocations, which result in the expression of chimeric proteins that may promote tumor growth. Alternative splicing results in multiple transcript variants. | LPP | NA |
| ENSG00000096696 | 1832 | desmoplakin | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. | DSP | NA |
| ENSG00000163431 | 25802 | leiomodin 1 | The leiomodin 1 protein has a putative membrane-spanning region and 2 types of tandemly repeated blocks. The transcript is expressed in all tissues tested, with the highest levels in thyroid, eye muscle, skeletal muscle, and ovary. Increased expression of leiomodin 1 may be linked to Graves’ disease and thyroid-associated ophthalmopathy. | LMOD1 | NA |
| ENSG00000115386 | 5967 | regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1A | NA |
| ENSG00000148600 | 92211 | cadherin related family member 1 | This gene belongs to the cadherin superfamily of calcium-dependent cell adhesion molecules. The encoded protein is a photoreceptor-specific cadherin that plays a role in outer segment disc morphogenesis. Mutations in this gene are associated with inherited retinal dystrophies. Alternatively spliced transcript variants encoding different isoforms have been identified. | CDHR1 | NA |
| ENSG00000143248 | 8490 | regulator of G-protein signaling 5 | This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | RGS5 | NA |
| ENSG00000167768 | 3848 | keratin 1 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT1 | NA |
| ENSG00000225630 | ENSG00000225630 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | NA | MTND2P28 | NA |
| ENSG00000133392 | 4629 | myosin, heavy chain 11, smooth muscle | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | MYH11 | NA |
| ENSG00000175084 | 1674 | desmin | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | DES | NA |
| ENSG00000229732 | ENSG00000229732 | NA | NA | AC019349.5 | NA |
| ENSG00000171476 | 84525 | HOP homeobox | The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. | HOPX | NA |
| ENSG00000184009 | 71 | actin gamma 1 | Actins are highly conserved proteins that are involved in various types of cell motility, and maintenance of the cytoskeleton. In vertebrates, three main groups of actin isoforms, alpha, beta and gamma have been identified. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton, and as mediators of internal cell motility. Actin, gamma 1, encoded by this gene, is a cytoplasmic actin found in non-muscle cells. Mutations in this gene are associated with DFNA20/26, a subtype of autosomal dominant non-syndromic sensorineural progressive hearing loss. Alternative splicing results in multiple transcript variants. | ACTG1 | NA |
| ENSG00000167641 | 94274 | protein phosphatase 1 regulatory inhibitor subunit 14A | The protein encoded by this gene belongs to the protein phosphatase 1 (PP1) inhibitor family. This protein is an inhibitor of smooth muscle myosin phosphatase, and has higher inhibitory activity when phosphorylated. Inhibition of myosin phosphatase leads to increased myosin phosphorylation and enhanced smooth muscle contraction. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | PPP1R14A | NA |
| ENSG00000072952 | 10335 | murine retrovirus integration site 1 homolog | This gene is similar to a putative mouse tumor suppressor gene (Mrvi1) that is frequently disrupted by mouse AIDS-related virus (MRV). The encoded protein, which is found in the membrane of the endoplasmic reticulum, is similar to Jaw1, a lymphoid-restricted protein whose expression is down-regulated during lymphoid differentiation. This protein is a substrate of cGMP-dependent kinase-1 (PKG1) that can function as a regulator of IP3-induced calcium release. Studies in mouse suggest that MRV integration at Mrvi1 induces myeloid leukemia by altering the expression of a gene important for myeloid cell growth and/or differentiation, and thus this gene may function as a myeloid leukemia tumor suppressor gene. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, and alternative translation start sites, including a non-AUG (CUG) start site, are used. | MRVI1 | NA |
| ENSG00000079308 | 7145 | tensin 1 | The protein encoded by this gene localizes to focal adhesions, regions of the plasma membrane where the cell attaches to the extracellular matrix. This protein crosslinks actin filaments and contains a Src homology 2 (SH2) domain, which is often found in molecules involved in signal transduction. This protein is a substrate of calpain II. Alternative splicing results in multiple transcript variants encoding different isoforms. | TNS1 | NA |
| ENSG00000145824 | 9547 | C-X-C motif chemokine ligand 14 | This antimicrobial gene belongs to the cytokine gene family which encode secreted proteins involved in immunoregulatory and inflammatory processes. The protein encoded by this gene is structurally related to the CXC (Cys-X-Cys) subfamily of cytokines. Members of this subfamily are characterized by two cysteines separated by a single amino acid. This cytokine displays chemotactic activity for monocytes but not for lymphocytes, dendritic cells, neutrophils or macrophages. It has been implicated that this cytokine is involved in the homeostasis of monocyte-derived macrophages rather than in inflammation. | CXCL14 | NA |
| ENSG00000170477 | 3851 | keratin 4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT4 | NA |
| ENSG00000116133 | 1718 | 24-dehydrocholesterol reductase | This gene encodes a flavin adenine dinucleotide (FAD)-dependent oxidoreductase which catalyzes the reduction of the delta-24 double bond of sterol intermediates during cholesterol biosynthesis. The protein contains a leader sequence that directs it to the endoplasmic reticulum membrane. Missense mutations in this gene have been associated with desmosterolosis. Also, reduced expression of the gene occurs in the temporal cortex of Alzheimer disease patients and overexpression has been observed in adrenal gland cancer cells. | DHCR24 | NA |
| ENSG00000198467 | 7169 | tropomyosin 2 (beta) | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | TPM2 | NA |
| ENSG00000115414 | 2335 | fibronectin 1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | FN1 | NA |
| ENSG00000148795 | 1586 | cytochrome P450 family 17 subfamily A member 1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | CYP17A1 | NA |
| ENSG00000128591 | 2318 | filamin C | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | FLNC | NA |
| ENSG00000196616 | 125 | alcohol dehydrogenase 1B (class I), beta polypeptide | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. | ADH1B | NA |
| ENSG00000157110 | 11030 | RNA binding protein with multiple splicing | This gene encodes a member of the RNA recognition motif family of RNA-binding proteins. The RNA recognition motif is between 80-100 amino acids in length and family members contain one to four copies of the motif. The RNA recognition motif consists of two short stretches of conserved sequence, as well as a few highly conserved hydrophobic residues. The encoded protein has a single, putative RNA recognition motif in its N-terminus. Alternative splicing results in multiple transcript variants encoding different isoforms. | RBPMS | NA |
| ENSG00000143546 | 6279 | S100 calcium binding protein A8 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. | S100A8 | NA |
| ENSG00000159176 | 1465 | cysteine and glycine rich protein 1 | This gene encodes a member of the cysteine-rich protein (CSRP) family. This gene family includes a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this gene product occurs in proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Alternatively spliced transcript variants have been described. | CSRP1 | NA |
| ENSG00000106809 | 4969 | osteoglycin | This gene encodes a member of the small leucine-rich proteoglycan (SLRP) family of proteins. The encoded protein induces ectopic bone formation in conjunction with transforming growth factor beta and may regulate osteoblast differentiation. High expression of the encoded protein may be associated with elevated heart left ventricular mass. Alternative splicing results in multiple transcript variants. | OGN | NA |
| ENSG00000163209 | 6707 | small proline rich protein 3 | NA | SPRR3 | NA |
| ENSG00000197616 | 4624 | myosin, heavy chain 6, cardiac muscle, alpha | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | MYH6 | NA |
| ENSG00000269936 | ENSG00000269936 | NA | NA | RP11-394O4.5 | NA |
| ENSG00000049540 | 2006 | elastin | This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | ELN | NA |
| ENSG00000011465 | 1634 | decorin | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | DCN | NA |
| ENSG00000203782 | 4014 | loricrin | This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel’s syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases. | LOR | NA |
| ENSG00000257017 | 3240 | haptoglobin | This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | HP | NA |
| ENSG00000106123 | 2051 | EPH receptor B6 | This gene encodes a member of a family of transmembrane proteins that function as receptors for ephrin-B family proteins. Unlike other members of this family, the encoded protein does not contain a functional kinase domain. Activity of this protein can influence cell adhesion and migration. Expression of this gene is downregulated during tumor progression, suggesting that the protein may suppress tumor invasion and metastasis. Alternative splicing results in multiple transcript variants. | EPHB6 | NA |
| ENSG00000111341 | 4256 | matrix Gla protein | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | MGP | NA |
| ENSG00000077943 | 8516 | integrin subunit alpha 8 | Integrins are heterodimeric transmembrane receptor proteins that mediate numerous cellular processes including cell adhesion, cytoskeletal rearrangement, and activation of cell signaling pathways. Integrins are composed of alpha and beta subunits. This gene encodes the alpha 8 subunit of the heterodimeric integrin alpha8beta1 protein. The encoded protein is a single-pass type 1 membrane protein that contains multiple FG-GAP repeats. This repeat is predicted to fold into a beta propeller structure. This gene regulates the recruitment of mesenchymal cells into epithelial structures, mediates cell-cell interactions, and regulates neurite outgrowth of sensory and motor neurons. The integrin alpha8beta1 protein thus plays an important role in wound-healing and organogenesis. Mutations in this gene have been associated with renal hypodysplasia/aplasia-1 (RHDA1) and with several animal models of chronic kidney disease. Alternate splicing results in multiple transcript variants encoding distinct isoforms. | ITGA8 | NA |
| ENSG00000204388 | 3304 | heat shock protein family A (Hsp70) member 1B | This intronless gene encodes a 70kDa heat shock protein which is a member of the heat shock protein 70 family. In conjunction with other heat shock proteins, this protein stabilizes existing proteins against aggregation and mediates the folding of newly translated proteins in the cytosol and in organelles. It is also involved in the ubiquitin-proteasome pathway through interaction with the AU-rich element RNA-binding protein 1. The gene is located in the major histocompatibility complex class III region, in a cluster with two closely related genes which encode similar proteins. | HSPA1B | NA |
| ENSG00000080573 | 50509 | collagen type V alpha 3 | This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. Mutations in this gene are thought to be responsible for the symptoms of a subset of patients with Ehlers-Danlos syndrome type III. Messages of several sizes can be detected in northern blots but sequence information cannot confirm the identity of the shorter messages. | COL5A3 | NA |
| ENSG00000197746 | 5660 | prosaposin | This gene encodes a highly conserved preproprotein that is proteolytically processed to generate four main cleavage products including saposins A, B, C, and D. Each domain of the precursor protein is approximately 80 amino acid residues long with nearly identical placement of cysteine residues and glycosylation sites. Saposins A-D localize primarily to the lysosomal compartment where they facilitate the catabolism of glycosphingolipids with short oligosaccharide groups. The precursor protein exists both as a secretory protein and as an integral membrane protein and has neurotrophic activities. Mutations in this gene have been associated with Gaucher disease and metachromatic leukodystrophy. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | PSAP | NA |
| ENSG00000178585 | 56998 | catenin beta interacting protein 1 | The protein encoded by this gene binds CTNNB1 and prevents interaction between CTNNB1 and TCF family members. The encoded protein is a negative regulator of the Wnt signaling pathway. Two transcript variants encoding the same protein have been found for this gene. | CTNNBIP1 | NA |
| ENSG00000068078 | 2261 | fibroblast growth factor receptor 3 | This gene encodes a member of the fibroblast growth factor receptor (FGFR) family, with its amino acid sequence being highly conserved between members and among divergent species. FGFR family members differ from one another in their ligand affinities and tissue distribution. A full-length representative protein would consist of an extracellular region, composed of three immunoglobulin-like domains, a single hydrophobic membrane-spanning segment and a cytoplasmic tyrosine kinase domain. The extracellular portion of the protein interacts with fibroblast growth factors, setting in motion a cascade of downstream signals, ultimately influencing mitogenesis and differentiation. This particular family member binds acidic and basic fibroblast growth hormone and plays a role in bone development and maintenance. Mutations in this gene lead to craniosynostosis and multiple types of skeletal dysplasia. Three alternatively spliced transcript variants that encode different protein isoforms have been described. | FGFR3 | NA |
| ENSG00000163631 | 213 | albumin | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | ALB | NA |
| ENSG00000087266 | 6452 | SH3 domain binding protein 2 | The protein encoded by this gene has an N-terminal pleckstrin homology (PH) domain, an SH3-binding proline-rich region, and a C-terminal SH2 domain. The protein binds to the SH3 domains of several proteins including the ABL1 and SYK protein tyrosine kinases, and functions as a cytoplasmic adaptor protein to positively regulate transcriptional activity in T, natural killer (NK), and basophilic cells. Mutations in this gene result in cherubism. Multiple transcript variants encoding different isoforms have been found for this gene. | SH3BP2 | NA |
| ENSG00000259627 | ENSG00000259627 | NA | NA | RP11-244F12.2 | NA |
| ENSG00000137857 | 53905 | dual oxidase 1 | The protein encoded by this gene is a glycoprotein and a member of the NADPH oxidase family. The synthesis of thyroid hormone is catalyzed by a protein complex located at the apical membrane of thyroid follicular cells. This complex contains an iodide transporter, thyroperoxidase, and a peroxide generating system that includes proteins encoded by this gene and the similar DUOX2 gene. This protein is known as dual oxidase because it has both a peroxidase homology domain and a gp91phox domain. This protein generates hydrogen peroxide and thereby plays a role in the activity of thyroid peroxidase, lactoperoxidase, and in lactoperoxidase-mediated antimicrobial defense at mucosal surfaces. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. | DUOX1 | NA |
| ENSG00000140416 | 7168 | tropomyosin 1 (alpha) | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | TPM1 | NA |
| ENSG00000237973 | ENSG00000237973 | MT-CO1 pseudogene 12 | NA | MTCO1P12 | NA |
| ENSG00000178372 | 51806 | calmodulin like 5 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. | CALML5 | NA |
| ENSG00000172023 | 5968 | regenerating family member 1 beta | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1B | NA |
| ENSG00000159251 | 70 | actin, alpha, cardiac muscle 1 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | ACTC1 | NA |
| ENSG00000065534 | 4638 | myosin light chain kinase | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | MYLK | NA |
| ENSG00000072110 | 87 | actinin alpha 1 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | ACTN1 | NA |
| ENSG00000077782 | 2260 | fibroblast growth factor receptor 1 | The protein encoded by this gene is a member of the fibroblast growth factor receptor (FGFR) family, where amino acid sequence is highly conserved between members and throughout evolution. FGFR family members differ from one another in their ligand affinities and tissue distribution. A full-length representative protein consists of an extracellular region, composed of three immunoglobulin-like domains, a single hydrophobic membrane-spanning segment and a cytoplasmic tyrosine kinase domain. The extracellular portion of the protein interacts with fibroblast growth factors, setting in motion a cascade of downstream signals, ultimately influencing mitogenesis and differentiation. This particular family member binds both acidic and basic fibroblast growth factors and is involved in limb induction. Mutations in this gene have been associated with Pfeiffer syndrome, Jackson-Weiss syndrome, Antley-Bixler syndrome, osteoglophonic dysplasia, and autosomal dominant Kallmann syndrome 2. Chromosomal aberrations involving this gene are associated with stem cell myeloproliferative disorder and stem cell leukemia lymphoma syndrome. Alternatively spliced variants which encode different protein isoforms have been described; however, not all variants have been fully characterized. | FGFR1 | NA |
| ENSG00000163017 | 72 | actin, gamma 2, smooth muscle, enteric | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | ACTG2 | NA |
| ENSG00000161634 | 117159 | dermcidin | This antimicrobial gene encodes a secreted protein that is subsequently processed into mature peptides of distinct biological activities. The C-terminal peptide is constitutively expressed in sweat and has antibacterial and antifungal activities. The N-terminal peptide, also known as diffusible survival evasion peptide, promotes neural cell survival under conditions of severe oxidative stress. A glycosylated form of the N-terminal peptide may be associated with cachexia (muscle wasting) in cancer patients. Alternative splicing results in multiple transcript variants encoding different isoforms. | DCD | NA |
| ENSG00000143126 | 1952 | cadherin EGF LAG seven-pass G-type receptor 2 | The protein encoded by this gene is a member of the flamingo subfamily, part of the cadherin superfamily. The flamingo subfamily consists of nonclassic-type cadherins; a subpopulation that does not interact with catenins. The flamingo cadherins are located at the plasma membrane and have nine cadherin domains, seven epidermal growth factor-like repeats and two laminin A G-type repeats in their ectodomain. They also have seven transmembrane domains, a characteristic unique to this subfamily. It is postulated that these proteins are receptors involved in contact-mediated communication, with cadherin domains acting as homophilic binding regions and the EGF-like domains involved in cell adhesion and receptor-ligand interactions. The specific function of this particular member has not been determined. | CELSR2 | NA |
| ENSG00000156113 | 3778 | potassium calcium-activated channel subfamily M alpha 1 | MaxiK channels are large conductance, voltage and calcium-sensitive potassium channels which are fundamental to the control of smooth muscle tone and neuronal excitability. MaxiK channels can be formed by 2 subunits: the pore-forming alpha subunit, which is the product of this gene, and the modulatory beta subunit. Intracellular calcium regulates the physical association between the alpha and beta subunits. Alternatively spliced transcript variants encoding different isoforms have been identified. | KCNMA1 | NA |
| ENSG00000081277 | 5317 | plakophilin 1 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may be involved in molecular recruitment and stabilization during desmosome formation. Mutations in this gene have been associated with the ectodermal dysplasia/skin fragility syndrome. Two transcript variants encoding different isoforms have been found for this gene. | PKP1 | NA |
| ENSG00000135046 | 301 | annexin A1 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | ANXA1 | NA |
| ENSG00000132470 | 3691 | integrin subunit beta 4 | Integrins are heterodimers comprised of alpha and beta subunits, that are noncovalently associated transmembrane glycoprotein receptors. Different combinations of alpha and beta polypeptides form complexes that vary in their ligand-binding specificities. Integrins mediate cell-matrix or cell-cell adhesion, and transduced signals that regulate gene expression and cell growth. This gene encodes the integrin beta 4 subunit, a receptor for the laminins. This subunit tends to associate with alpha 6 subunit and is likely to play a pivotal role in the biology of invasive carcinoma. Mutations in this gene are associated with epidermolysis bullosa with pyloric atresia. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | ITGB4 | NA |
| ENSG00000106772 | 158471 | prune homolog 2 | The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. | PRUNE2 | NA |
| ENSG00000113140 | 6678 | secreted protein acidic and cysteine rich | This gene encodes a cysteine-rich acidic matrix-associated protein. The encoded protein is required for the collagen in bone to become calcified but is also involved in extracellular matrix synthesis and promotion of changes to cell shape. The gene product has been associated with tumor suppression but has also been correlated with metastasis based on changes to cell shape which can promote tumor cell invasion. Three transcript variants encoding different isoforms have been found for this gene. | SPARC | NA |
| ENSG00000138735 | 8654 | phosphodiesterase 5A | This gene encodes a cGMP-binding, cGMP-specific phosphodiesterase, a member of the cyclic nucleotide phosphodiesterase family. This phosphodiesterase specifically hydrolyzes cGMP to 5’-GMP. It is involved in the regulation of intracellular concentrations of cyclic nucleotides and is important for smooth muscle relaxation in the cardiovascular system. Alternative splicing of this gene results in three transcript variants encoding distinct isoforms. | PDE5A | NA |
| ENSG00000122786 | 800 | caldesmon 1 | This gene encodes a calmodulin- and actin-binding protein that plays an essential role in the regulation of smooth muscle and nonmuscle contraction. The conserved domain of this protein possesses the binding activities to Ca(2+)-calmodulin, actin, tropomyosin, myosin, and phospholipids. This protein is a potent inhibitor of the actin-tropomyosin activated myosin MgATPase, and serves as a mediating factor for Ca(2+)-dependent inhibition of smooth muscle contraction. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | CALD1 | NA |
| ENSG00000256309 | NA | NA | NA | NA | TRUE |
| ENSG00000103034 | 65009 | NDRG family member 4 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that is required for cell cycle progression and survival in primary astrocytes and may be involved in the regulation of mitogenic signalling in vascular smooth muscle cells. Alternative splicing results in multiple transcripts encoding different isoforms. | NDRG4 | NA |
| ENSG00000125730 | 718 | complement component 3 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | C3 | NA |
| ENSG00000187605 | 200424 | tet methylcytosine dioxygenase 3 | Members of the ten-eleven translocation (TET) gene family, including TET3, play a role in the DNA methylation process (Langemeijer et al., 2009 [PubMed 19923888]). | TET3 | NA |
| ENSG00000122304 | 5620 | protamine 2 | Protamines substitute for histones in the chromatin of sperm during the haploid phase of spermatogenesis, and are the major DNA-binding proteins in the nucleus of sperm in many vertebrates. They package the sperm DNA into a highly condensed complex in a volume less than 5% of a somatic cell nucleus. Many mammalian species have only one protamine (protamine 1); however, a few species, including human and mouse, have two. This gene encodes protamine 2, which is cleaved to give rise to a family of protamine 2 peptides. Alternatively spliced transcript variants have also been found for this gene. | PRM2 | NA |
| ENSG00000108828 | 10493 | vesicle amine transport 1 | Synaptic vesicles are responsible for regulating the storage and release of neurotransmitters in the nerve terminal. The protein encoded by this gene is an abundant integral membrane protein of cholinergic synaptic vesicles and is thought to be involved in vesicular transport. It belongs to the quinone oxidoreductase subfamily of zinc-containing alcohol dehydrogenase proteins. | VAT1 | NA |
| ENSG00000136153 | 4008 | LIM domain 7 | This gene encodes a protein containing a calponin homology (CH) domain, a PDZ domain, and a LIM domain, and may be involved in protein-protein interactions. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, however, the full-length nature of some variants is not known. | LMO7 | NA |
| ENSG00000186081 | 3852 | keratin 5 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT5 | NA |
| ENSG00000173801 | 3728 | junction plakoglobin | This gene encodes a major cytoplasmic protein which is the only known constituent common to submembranous plaques of both desmosomes and intermediate junctions. This protein forms distinct complexes with cadherins and desmosomal cadherins and is a member of the catenin family since it contains a distinct repeating amino acid motif called the armadillo repeat. Mutation in this gene has been associated with Naxos disease. Alternative splicing occurs in this gene; however, not all transcripts have been fully described. | JUP | NA |
| ENSG00000175183 | 1466 | cysteine and glycine rich protein 2 | CSRP2 is a member of the CSRP family of genes, encoding a group of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. CRP2 contains two copies of the cysteine-rich amino acid sequence motif (LIM) with putative zinc-binding activity, and may be involved in regulating ordered cell growth. Other genes in the family include CSRP1 and CSRP3. Alternative splicing results in multiple transcript variants. | CSRP2 | NA |
| ENSG00000185201 | 10581 | interferon induced transmembrane protein 2 | NA | IFITM2 | NA |
| ENSG00000164266 | 6690 | serine peptidase inhibitor, Kazal type 1 | The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. | SPINK1 | NA |
| ENSG00000118194 | 7139 | troponin T2, cardiac type | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | TNNT2 | NA |
| ENSG00000009307 | 7812 | cold shock domain containing E1 | NA | CSDE1 | NA |
| ENSG00000182871 | 80781 | collagen type XVIII alpha 1 chain | This gene encodes the alpha chain of type XVIII collagen. This collagen is one of the multiplexins, extracellular matrix proteins that contain multiple triple-helix domains (collagenous domains) interrupted by non-collagenous domains. A long isoform of the protein has an N-terminal domain that is homologous to the extracellular part of frizzled receptors. Proteolytic processing at several endogenous cleavage sites in the C-terminal domain results in production of endostatin, a potent antiangiogenic protein that is able to inhibit angiogenesis and tumor growth. Mutations in this gene are associated with Knobloch syndrome. The main features of this syndrome involve retinal abnormalities, so type XVIII collagen may play an important role in retinal structure and in neural tube closure. Alternative splicing results in multiple transcript variants. | COL18A1 | NA |
| ENSG00000155657 | 7273 | titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, an N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | TTN | NA |
| ENSG00000163754 | 2992 | glycogenin 1 | This gene encodes a member of the glycogenin family. Glycogenin is a glycosyltransferase that catalyzes the formation of a short glucose polymer from uridine diphosphate glucose in an autoglucosylation reaction. This reaction is followed by elongation and branching of the polymer, catalyzed by glycogen synthase and branching enzyme, to form glycogen. This gene is expressed in muscle and other tissues. Mutations in this gene result in glycogen storage disease XV. This gene has pseudogenes on chromosomes 1, 8 and 13 respectively. Alternatively spliced transcript variants encoding different isoforms have been identified. | GYG1 | NA |
| ENSG00000120885 | 1191 | clusterin | The protein encoded by this gene is a secreted chaperone that can under some stress conditions also be found in the cell cytosol. It has been suggested to be involved in several basic biological events such as cell death, tumor progression, and neurodegenerative disorders. Alternate splicing results in both coding and non-coding variants. | CLU | NA |
| ENSG00000101335 | 10398 | myosin light chain 9 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | MYL9 | NA |
| ENSG00000109610 | 6649 | superoxide dismutase 3, extracellular | This gene encodes a member of the superoxide dismutase (SOD) protein family. SODs are antioxidant enzymes that catalyze the conversion of superoxide radicals into hydrogen peroxide and oxygen, which may protect the brain, lungs, and other tissues from oxidative stress. Proteolytic processing of the encoded protein results in the formation of two distinct homotetramers that differ in their ability to interact with the extracellular matrix (ECM). Homotetramers consisting of the intact protein, or type C subunit, exhibit high affinity for heparin and are anchored to the ECM. Homotetramers consisting of a proteolytically cleaved form of the protein, or type A subunit, exhibit low affinity for heparin and do not interact with the ECM. A mutation in this gene may be associated with increased heart disease risk. | SOD3 | NA |
| ENSG00000143536 | 49860 | cornulin | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | CRNN | NA |
| ENSG00000185532 | 5592 | protein kinase, cGMP-dependent, type I | Mammals have three different isoforms of cyclic GMP-dependent protein kinase (Ialpha, Ibeta, and II). These PRKG isoforms act as key mediators of the nitric oxide/cGMP signaling pathway and are important components of many signal transduction processes in diverse cell types. This PRKG1 gene on human chromosome 10 encodes the soluble Ialpha and Ibeta isoforms of PRKG by alternative transcript splicing. A separate gene on human chromosome 4, PRKG2, encodes the membrane-bound PRKG isoform II. The PRKG1 proteins play a central role in regulating cardiovascular and neuronal functions in addition to relaxing smooth muscle tone, preventing platelet aggregation, and modulating cell growth. This gene is most strongly expressed in all types of smooth muscle, platelets, cerebellar Purkinje cells, hippocampal neurons, and the lateral amygdala. Isoforms Ialpha and Ibeta have identical cGMP-binding and catalytic domains but differ in their leucine/isoleucine zipper and autoinhibitory sequences and therefore differ in their dimerization substrates and kinase enzyme activity. | PRKG1 | NA |
| ENSG00000132329 | 10267 | receptor activity modifying protein 1 | The protein encoded by this gene is a member of the RAMP family of single-transmembrane-domain proteins, called receptor (calcitonin) activity modifying proteins (RAMPs). RAMPs are type I transmembrane proteins with an extracellular N terminus and a cytoplasmic C terminus. RAMPs are required to transport calcitonin-receptor-like receptor (CRLR) to the plasma membrane. CRLR, a receptor with seven transmembrane domains, can function as either a calcitonin-gene-related peptide (CGRP) receptor or an adrenomedullin receptor, depending on which members of the RAMP family are expressed. In the presence of this (RAMP1) protein, CRLR functions as a CGRP receptor. The RAMP1 protein is involved in the terminal glycosylation, maturation, and presentation of the CGRP receptor to the cell surface. Alternative splicing results in multiple transcript variants encoding different isoforms. | RAMP1 | NA |
| ENSG00000244734 | 3043 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | HBB | NA |
| ENSG00000147872 | 123 | perilipin 2 | The protein encoded by this gene belongs to the perilipin family, members of which coat intracellular lipid storage droplets. This protein is associated with the lipid globule surface membrane material, and may be involved in development and maintenance of adipose tissue. However, it is not restricted to adipocytes as previously thought, but is found in a wide range of cultured cell lines, including fibroblasts, endothelial and epithelial cells, and tissues, such as lactating mammary gland, adrenal cortex, Sertoli and Leydig cells, and hepatocytes in alcoholic liver cirrhosis, suggesting that it may serve as a marker of lipid accumulation in diverse cell types and diseases. Alternatively spliced transcript variants have been found for this gene. | PLIN2 | NA |
| ENSG00000159069 | 54461 | F-box and WD repeat domain containing 5 | This gene encodes a member of the F-box protein family, members of which are characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into three classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene contains WD-40 domains, in addition to an F-box motif, so it belongs to the Fbw class. Alternatively spliced transcript variants encoding distinct isoforms have been identified for this gene, however, they were found to be nonsense-mediated mRNA decay (NMD) candidates, hence not represented. | FBXW5 | NA |
# Save the queried Ensembl IDs for cluster 13, one per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 13, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
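
The query-then-write step above is repeated for each row of `gene_list`, with the cluster index typed by hand in both the query and the filename. A minimal sketch of a helper that derives the filename from a single index is shown below. The names `write_cluster_genes`, `out_dir`, and `demo_dir` are hypothetical and not part of the original analysis, and the demonstration uses a temporary directory rather than the `../utilities/` path.

```r
# Hypothetical helper (an assumption, not the original analysis code):
# builds the per-cluster filename from one index and writes one ID per line.
write_cluster_genes <- function(ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(ids, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Self-contained demonstration with two Ensembl IDs and a temporary directory.
demo_dir <- tempfile("gene_names_")
dir.create(demo_dir)
demo_path <- write_cluster_genes(c("ENSG00000163220", "ENSG00000244734"),
                                 14, demo_dir)
readLines(demo_path)
```

In the analysis itself, `ids` would presumably be `out$query` from the corresponding `mygene::queryMany` call, so that the row index of `gene_list` and the index in the filename cannot drift apart.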
# Annotate the genes in row 14 of gene_list via MyGene.info.
out <- mygene::queryMany(gene_list[14, ], scopes = "ensembl.gene",
                         fields = c("name", "summary", "symbol"), species = "human")
## Finished
kable(as.data.frame(out))
| symbol | X_id | query | name | summary |
|---|---|---|---|---|
| S100A9 | 6280 | ENSG00000163220 | S100 calcium binding protein A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. |
| HBB | 3043 | ENSG00000244734 | hemoglobin subunit beta | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. |
| S100A8 | 6279 | ENSG00000143546 | S100 calcium binding protein A8 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and as a cytokine. Altered expression of this protein is associated with the disease cystic fibrosis. Multiple transcript variants encoding different isoforms have been found for this gene. |
| HBA2 | 3040 | ENSG00000188536 | hemoglobin subunit alpha 2 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. |
| KRT13 | 3860 | ENSG00000171401 | keratin 13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. |
| REG1A | 5967 | ENSG00000115386 | regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. |
| KRT4 | 3851 | ENSG00000170477 | keratin 4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. |
| SVIL | 6840 | ENSG00000197321 | supervillin | This gene encodes a bipartite protein with distinct amino- and carboxy-terminal domains. The amino-terminus contains nuclear localization signals and the carboxy-terminus contains numerous consecutive sequences with extensive similarity to proteins in the gelsolin family of actin-binding proteins, which cap, nucleate, and/or sever actin filaments. The gene product is tightly associated with both actin filaments and plasma membranes, suggesting a role as a high-affinity link between the actin cytoskeleton and the membrane. The encoded protein appears to aid in both myosin II assembly during cell spreading and disassembly of focal adhesions. Several transcript variants encoding different isoforms of supervillin have been described. |
| DES | 1674 | ENSG00000175084 | desmin | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. |
| MYO1F | 4542 | ENSG00000142347 | myosin IF | NA |
| C10orf54 | 64115 | ENSG00000107738 | chromosome 10 open reading frame 54 | NA |
| CSF3R | 1441 | ENSG00000119535 | colony stimulating factor 3 receptor | The protein encoded by this gene is the receptor for colony stimulating factor 3, a cytokine that controls the production, differentiation, and function of granulocytes. The encoded protein, which is a member of the family of cytokine receptors, may also function in some cell surface adhesion or recognition processes. Alternatively spliced transcript variants have been described. Mutations in this gene are a cause of Kostmann syndrome, also known as severe congenital neutropenia. |
| CKM | 1158 | ENSG00000104879 | creatine kinase, M-type | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. |
| MKNK2 | 2872 | ENSG00000099875 | MAP kinase interacting serine/threonine kinase 2 | This gene encodes a member of the calcium/calmodulin-dependent protein kinases (CAMK) Ser/Thr protein kinase family, which belongs to the protein kinase superfamily. This protein contains conserved DLG (asp-leu-gly) and ENIL (glu-asn-ile-leu) motifs, and an N-terminal polybasic region which binds importin A and the translation factor scaffold protein eukaryotic initiation factor 4G (eIF4G). This protein is one of the downstream kinases activated by mitogen-activated protein (MAP) kinases. It phosphorylates the eukaryotic initiation factor 4E (eIF4E), thus playing important roles in the initiation of mRNA translation, oncogenic transformation and malignant cell proliferation. In addition to eIF4E, this protein also interacts with von Hippel-Lindau tumor suppressor (VHL), ring-box 1 (Rbx1) and Cullin2 (Cul2), which are all components of the CBC(VHL) ubiquitin ligase E3 complex. Multiple alternatively spliced transcript variants have been found, but the full-length nature and biological activity of only two variants are determined. These two variants encode distinct isoforms which differ in activity and regulation, and in subcellular localization. |
| COL4A2 | 1284 | ENSG00000134871 | collagen type IV alpha 2 | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. The C-terminal portion of the protein, known as canstatin, is an inhibitor of angiogenesis and tumor growth. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. |
| GAPDH | 2597 | ENSG00000111640 | glyceraldehyde-3-phosphate dehydrogenase | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. |
| TPM3 | 7170 | ENSG00000143549 | tropomyosin 3 | This gene encodes a member of the tropomyosin family of actin-binding proteins. Tropomyosins are dimers of coiled-coil proteins that provide stability to actin filaments and regulate access of other actin-binding proteins. Mutations in this gene result in autosomal dominant nemaline myopathy and other muscle disorders. This locus is involved in translocations with other loci, including anaplastic lymphoma receptor tyrosine kinase (ALK) and neurotrophic tyrosine kinase receptor type 1 (NTRK1), which result in the formation of fusion proteins that act as oncogenes. There are numerous pseudogenes for this gene on different chromosomes. Alternative splicing results in multiple transcript variants. |
| MEDAG | 84935 | ENSG00000102802 | mesenteric estrogen dependent adipogenesis | NA |
| FLOT2 | 2319 | ENSG00000132589 | flotillin 2 | Caveolae are small domains on the inner cell membrane involved in vesicular trafficking and signal transduction. This gene encodes a caveolae-associated, integral membrane protein, which is thought to function in neuronal signaling. |
| FLNC | 2318 | ENSG00000128591 | filamin C | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. |
| SERPINE1 | 5054 | ENSG00000106366 | serpin family E member 1 | This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. |
| MYH7 | 4625 | ENSG00000092054 | myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. |
| COL1A1 | 1277 | ENSG00000108821 | collagen type I alpha 1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. |
| EHBP1L1 | 254102 | ENSG00000173442 | EH domain binding protein 1 like 1 | NA |
| AC019349.5 | ENSG00000229732 | ENSG00000229732 | NA | NA |
| HBA1 | 3039 | ENSG00000206172 | hemoglobin subunit alpha 1 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. |
| CSTA | 1475 | ENSG00000121552 | cystatin A | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins, and kininogens. This gene encodes a stefin that functions as a cysteine protease inhibitor, forming tight complexes with papain and the cathepsins B, H, and L. The protein is one of the precursor proteins of cornified cell envelope in keratinocytes and plays a role in epidermal development and maintenance. Stefins have been proposed as prognostic and diagnostic tools for cancer. |
| IL1RN | 3557 | ENSG00000136689 | interleukin 1 receptor antagonist | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. |
| TTN | 7273 | ENSG00000155657 | titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. |
| RHOG | 391 | ENSG00000177105 | ras homolog family member G | This gene encodes a member of the Rho family of small GTPases, which cycle between inactive GDP-bound and active GTP-bound states and function as molecular switches in signal transduction cascades. Rho proteins promote reorganization of the actin cytoskeleton and regulate cell shape, attachment, and motility. The encoded protein facilitates translocation of a functional guanine nucleotide exchange factor (GEF) complex from the cytoplasm to the plasma membrane where ras-related C3 botulinum toxin substrate 1 is activated to promote lamellipodium formation and cell migration. Two related pseudogenes have been identified on chromosomes 20 and X. |
| SPRR3 | 6707 | ENSG00000163209 | small proline rich protein 3 | NA |
| SPI1 | 6688 | ENSG00000066336 | Spi-1 proto-oncogene | This gene encodes an ETS-domain transcription factor that activates gene expression during myeloid and B-lymphoid cell development. The nuclear protein binds to a purine-rich sequence known as the PU-box found near the promoters of target genes, and regulates their expression in coordination with other transcription factors and cofactors. The protein can also regulate alternative splicing of target genes. Multiple transcript variants encoding different isoforms have been found for this gene. |
| NCF4 | 4689 | ENSG00000100365 | neutrophil cytosolic factor 4 | The protein encoded by this gene is a cytosolic regulatory component of the superoxide-producing phagocyte NADPH-oxidase, a multicomponent enzyme system important for host defense. This protein is preferentially expressed in cells of myeloid lineage. It interacts primarily with neutrophil cytosolic factor 2 (NCF2/p67-phox) to form a complex with neutrophil cytosolic factor 1 (NCF1/p47-phox), which further interacts with the small G protein RAC1 and translocates to the membrane upon cell stimulation. This complex then activates flavocytochrome b, the membrane-integrated catalytic core of the enzyme system. The PX domain of this protein can bind phospholipid products of the PI(3) kinase, which suggests its role in PI(3) kinase-mediated signaling events. The phosphorylation of this protein was found to negatively regulate the enzyme activity. Alternatively spliced transcript variants encoding distinct isoforms have been observed. |
| CSTB | 1476 | ENSG00000160213 | cystatin B | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins L, H and B. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). |
| ALDOA | 226 | ENSG00000149925 | aldolase, fructose-bisphosphate A | The protein encoded by this gene, Aldolase A (fructose-bisphosphate aldolase), is a glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Three aldolase isozymes (A, B, and C), encoded by three different genes, are differentially expressed during development. Aldolase A is found in the developing embryo and is produced in even greater amounts in adult muscle. Aldolase A expression is repressed in adult liver, kidney and intestine and similar to aldolase C levels in brain and other nervous tissue. Aldolase A deficiency has been associated with myopathy and hemolytic anemia. Alternative splicing and alternative promoter usage results in multiple transcript variants. Related pseudogenes have been identified on chromosomes 3 and 10. |
| KRT7 | 3855 | ENSG00000135480 | keratin 7 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the simple epithelia lining the cavities of the internal organs and in the gland ducts and blood vessels. The genes encoding the type II cytokeratins are clustered in a region of chromosome 12q12-q13. Alternative splicing may result in several transcript variants; however, not all variants have been fully described. |
| A2M | 2 | ENSG00000175899 | alpha-2-macroglobulin | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. |
| HP | 3240 | ENSG00000257017 | haptoglobin | This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. |
| GP2 | 2813 | ENSG00000169347 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. |
| ATG16L2 | 89849 | ENSG00000168010 | autophagy related 16 like 2 | NA |
| TALDO1 | 6888 | ENSG00000177156 | transaldolase 1 | Transaldolase 1 is a key enzyme of the nonoxidative pentose phosphate pathway providing ribose-5-phosphate for nucleic acid synthesis and NADPH for lipid biosynthesis. This pathway can also maintain glutathione at a reduced state and thus protect sulfhydryl groups and cellular integrity from oxygen radicals. The functional gene of transaldolase 1 is located on chromosome 11 and a pseudogene is identified on chromosome 1 but there are conflicting map locations. The second and third exon of this gene were developed by insertion of a retrotransposable element. This gene is thought to be involved in multiple sclerosis. |
| SELPLG | 6404 | ENSG00000110876 | selectin P ligand | This gene encodes a glycoprotein that functions as a high affinity counter-receptor for the cell adhesion molecules P-, E- and L- selectin expressed on myeloid cells and stimulated T lymphocytes. As such, this protein plays a critical role in leukocyte trafficking during inflammation by tethering of leukocytes to activated platelets or endothelia expressing selectins. This protein requires two post-translational modifications, tyrosine sulfation and the addition of the sialyl Lewis x tetrasaccharide (sLex) to its O-linked glycans, for its high-affinity binding activity. Aberrant expression of this gene and polymorphisms in this gene are associated with defects in the innate and adaptive immune response. Alternate splicing results in multiple transcript variants. |
| PYGM | 5837 | ENSG00000068976 | phosphorylase, glycogen, muscle | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. |
| MXD1 | 4084 | ENSG00000059728 | MAX dimerization protein 1 | This gene encodes a member of the MYC/MAX/MAD network of basic helix-loop-helix leucine zipper transcription factors. The MYC/MAX/MAD transcription factors mediate cellular proliferation, differentiation and apoptosis. The encoded protein antagonizes MYC-mediated transcriptional activation of target genes by competing for the binding partner MAX and recruiting repressor complexes containing histone deacetylases. Mutations in this gene may play a role in acute leukemia, and the encoded protein is a potential tumor suppressor. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| REG1B | 5968 | ENSG00000172023 | regenerating family member 1 beta | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. |
| PACSIN3 | 29763 | ENSG00000165912 | protein kinase C and casein kinase substrate in neurons 3 | This gene is a member of the protein kinase C and casein kinase substrate in neurons family. The encoded protein is involved in linking the actin cytoskeleton with vesicle formation. Alternative splicing results in multiple transcript variants. |
| GPSM3 | 63940 | ENSG00000213654 | G-protein signaling modulator 3 | NA |
| RBM38 | 55544 | ENSG00000132819 | RNA binding motif protein 38 | NA |
| ATP1A1 | 476 | ENSG00000163399 | ATPase Na+/K+ transporting subunit alpha 1 | The protein encoded by this gene belongs to the family of P-type cation transport ATPases, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The catalytic subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes an alpha 1 subunit. Multiple transcript variants encoding different isoforms have been found for this gene. |
| ATP2A2 | 488 | ENSG00000174437 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 2 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol into the sarcoplasmic reticulum lumen, and is involved in regulation of the contraction/relaxation cycle. Mutations in this gene cause Darier-White disease, also known as keratosis follicularis, an autosomal dominant skin disorder characterized by loss of adhesion between epidermal cells and abnormal keratinization. Alternative splicing results in multiple transcript variants encoding different isoforms. |
| TGM2 | 7052 | ENSG00000198959 | transglutaminase 2 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. |
| TYROBP | 7305 | ENSG00000011600 | TYRO protein tyrosine kinase binding protein | This gene encodes a transmembrane signaling polypeptide which contains an immunoreceptor tyrosine-based activation motif (ITAM) in its cytoplasmic domain. The encoded protein may associate with the killer-cell inhibitory receptor (KIR) family of membrane glycoproteins and may act as an activating signal transduction element. This protein may bind zeta-chain (TCR) associated protein kinase 70kDa (ZAP-70) and spleen tyrosine kinase (SYK) and play a role in signal transduction, bone modeling, brain myelination, and inflammation. Mutations within this gene have been associated with polycystic lipomembranous osteodysplasia with sclerosing leukoencephalopathy (PLOSL), also known as Nasu-Hakola disease. Its putative receptor, triggering receptor expressed on myeloid cells 2 (TREM2), also causes PLOSL. Multiple alternative transcript variants encoding distinct isoforms have been identified for this gene. |
| RAB10 | 10890 | ENSG00000084733 | RAB10, member RAS oncogene family | RAB10 belongs to the RAS (see HRAS; MIM 190020) superfamily of small GTPases. RAB proteins localize to exocytic and endocytic compartments and regulate intracellular vesicle trafficking (Bao et al., 1998 [PubMed 9918381]). |
| SPINK1 | 6690 | ENSG00000164266 | serine peptidase inhibitor, Kazal type 1 | The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. |
| PI3 | 5266 | ENSG00000124102 | peptidase inhibitor 3 | This gene encodes an elastase-specific inhibitor that functions as an antimicrobial peptide against Gram-positive and Gram-negative bacteria, and fungal pathogens. The protein contains a WAP-type four-disulfide core (WFDC) domain, and is thus a member of the WFDC domain family. Most WFDC gene members are localized to chromosome 20q12-q13 in two clusters: centromeric and telomeric. This gene belongs to the centromeric cluster. Expression of this gene is upregulated by bacterial lipopolysaccharides and cytokines. |
| CYP17A1 | 1586 | ENSG00000148795 | cytochrome P450 family 17 subfamily A member 1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. |
| HSPB1 | 3315 | ENSG00000106211 | heat shock protein family B (small) member 1 | The protein encoded by this gene is induced by environmental stress and developmental changes. The encoded protein is involved in stress resistance and actin organization and translocates from the cytoplasm to the nucleus upon stress induction. Defects in this gene are a cause of Charcot-Marie-Tooth disease type 2F (CMT2F) and distal hereditary motor neuropathy (dHMN). |
| UNC13D | 201294 | ENSG00000092929 | unc-13 homolog D | This gene encodes a protein that is a member of the UNC13 family, containing similar domain structure as other family members but lacking an N-terminal phorbol ester-binding C1 domain present in other Munc13 proteins. The protein appears to play a role in vesicle maturation during exocytosis and is involved in regulation of cytolytic granules secretion. Mutations in this gene are associated with familial hemophagocytic lymphohistiocytosis type 3, a genetically heterogeneous, rare autosomal recessive disorder. |
| MYBPC1 | 4604 | ENSG00000196091 | myosin binding protein C, slow type | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| MYL3 | 4634 | ENSG00000160808 | myosin light chain 3 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. |
| PRSS1 | 5644 | ENSG00000204983 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. |
| HCK | 3055 | ENSG00000101336 | HCK proto-oncogene, Src family tyrosine kinase | The protein encoded by this gene is a member of the Src family of tyrosine kinases. This protein is primarily hemopoietic, particularly in cells of the myeloid and B-lymphoid lineages. It may help couple the Fc receptor to the activation of the respiratory burst. In addition, it may play a role in neutrophil migration and in the degranulation of neutrophils. Multiple isoforms with different subcellular distributions are produced due to both alternative splicing and the use of alternative translation initiation codons, including a non-AUG (CUG) codon. |
| ABTB1 | 80325 | ENSG00000114626 | ankyrin repeat and BTB domain containing 1 | This gene encodes a protein with an ankyrin repeat region and two BTB/POZ domains, which are thought to be involved in protein-protein interactions. Expression of this gene is activated by the phosphatase and tensin homolog, a tumor suppressor. Alternate splicing results in three transcript variants. |
| CPA1 | 1357 | ENSG00000091704 | carboxypeptidase A1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. |
| GPX3 | 2878 | ENSG00000211445 | glutathione peroxidase 3 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. |
| PTGDS | 5730 | ENSG00000107317 | prostaglandin D2 synthase | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. |
| SYNM | 23336 | ENSG00000182253 | synemin | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. |
| NEB | 4703 | ENSG00000183091 | nebulin | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. |
| TNNI1 | 7135 | ENSG00000159173 | troponin I1, slow skeletal type | Troponin proteins associate with tropomyosin and regulate the calcium sensitivity of the myofibril contractile apparatus of striated muscles. Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. The TnI-fast and TnI-slow genes are expressed in fast-twitch and slow-twitch skeletal muscle fibers, respectively, while the TnI-cardiac gene is expressed exclusively in cardiac muscle tissue. This gene encodes the Troponin-I-skeletal-slow-twitch protein. This gene is expressed in cardiac and skeletal muscle during early development but is restricted to slow-twitch skeletal muscle fibers in adults. The encoded protein prevents muscle contraction by inhibiting calcium-mediated conformational changes in actin-myosin complexes. |
| CRNN | 49860 | ENSG00000143536 | cornulin | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. |
| CRIM1 | 51232 | ENSG00000150938 | cysteine rich transmembrane BMP regulator 1 (chordin-like) | This gene encodes a transmembrane protein containing six cysteine-rich repeat domains and an insulin-like growth factor-binding domain. The encoded protein may play a role in tissue development through interactions with members of the transforming growth factor beta family, such as bone morphogenetic proteins. |
| CELA3A | 10136 | ENSG00000142789 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. |
| APP | 351 | ENSG00000142192 | amyloid beta precursor protein | This gene encodes a cell surface receptor and transmembrane precursor protein that is cleaved by secretases to form a number of peptides. Some of these peptides are secreted and can bind to the acetyltransferase complex APBB1/TIP60 to promote transcriptional activation, while others form the protein basis of the amyloid plaques found in the brains of patients with Alzheimer disease. In addition, two of the peptides are antimicrobial peptides, having been shown to have bacteriocidal and antifungal activities. Mutations in this gene have been implicated in autosomal dominant Alzheimer disease and cerebroarterial amyloidosis (cerebral amyloid angiopathy). Multiple transcript variants encoding several different isoforms have been found for this gene. |
| FGA | 2243 | ENSG00000171560 | fibrinogen alpha chain | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. |
| SYNPO2 | 171024 | ENSG00000172403 | synaptopodin 2 | NA |
| HADHA | 3030 | ENSG00000084754 | hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase (trifunctional protein), alpha subunit | This gene encodes the alpha subunit of the mitochondrial trifunctional protein, which catalyzes the last three steps of mitochondrial beta-oxidation of long chain fatty acids. The mitochondrial membrane-bound heterocomplex is composed of four alpha and four beta subunits, with the alpha subunit catalyzing the 3-hydroxyacyl-CoA dehydrogenase and enoyl-CoA hydratase activities. Mutations in this gene result in trifunctional protein deficiency or LCHAD deficiency. The genes of the alpha and beta subunits of the mitochondrial trifunctional protein are located adjacent to each other in the human genome in a head-to-head orientation. |
| CPA2 | 1358 | ENSG00000158516 | carboxypeptidase A2 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. |
| GFPT1 | 2673 | ENSG00000198380 | glutamine–fructose-6-phosphate transaminase 1 | This gene encodes the first and rate-limiting enzyme of the hexosamine pathway and controls the flux of glucose into the hexosamine pathway. The product of this gene catalyzes the formation of glucosamine 6-phosphate. |
| KRT15 | 3866 | ENSG00000171346 | keratin 15 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region on chromosome 17q21.2. |
| MB | 4151 | ENSG00000198125 | myoglobin | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. |
| FCER1G | 2207 | ENSG00000158869 | Fc fragment of IgE receptor Ig | The high affinity IgE receptor is a key molecule involved in allergic reactions. It is a tetramer composed of 1 alpha, 1 beta, and 2 gamma chains. The gamma chains are also subunits of other Fc receptors. |
| YBX3 | 8531 | ENSG00000060138 | Y-box binding protein 3 | NA |
| ARHGAP9 | 64333 | ENSG00000123329 | Rho GTPase activating protein 9 | This gene encodes a member of the Rho-GAP family of GTPase activating proteins. The protein has substantial GAP activity towards several Rho-family GTPases in vitro, converting them to an inactive GDP-bound state. It is implicated in regulating adhesion of hematopoietic cells to the extracellular matrix. Multiple transcript variants encoding different isoforms have been found for this gene. |
| MYOZ1 | 58529 | ENSG00000177791 | myozenin 1 | The protein encoded by this gene is primarily expressed in the skeletal muscle, and belongs to the myozenin family. Members of this family function as calcineurin-interacting proteins that help tether calcineurin to the sarcomere of cardiac and skeletal muscle. They play an important role in modulation of calcineurin signaling. |
| CSK | 1445 | ENSG00000103653 | c-src tyrosine kinase | NA |
| PTPRG | 5793 | ENSG00000144724 | protein tyrosine phosphatase, receptor type G | The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This PTP possesses an extracellular region, a single transmembrane region, and two tandem intracytoplasmic catalytic domains, and thus represents a receptor-type PTP. The extracellular region of this PTP contains a carbonic anhydrase-like (CAH) domain, which is also found in the extracellular region of PTPRBETA/ZETA. This gene is located in a chromosomal region that is frequently deleted in renal cell carcinoma and lung carcinoma, thus is thought to be a candidate tumor suppressor gene. |
| PRSS3 | 5646 | ENSG00000010438 | protease, serine 3 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is expressed in the brain and pancreas and is resistant to common trypsin inhibitors. It is active on peptide linkages involving the carboxyl group of lysine or arginine. This gene is localized to the locus of T cell receptor beta variable orphans on chromosome 9. Four transcript variants encoding different isoforms have been described for this gene. |
| ACSL1 | 2180 | ENSG00000151726 | acyl-CoA synthetase long-chain family member 1 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. |
| B3GNT8 | 374907 | ENSG00000177191 | UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 8 | NA |
| ARHGAP27 | 201176 | ENSG00000159314 | Rho GTPase activating protein 27 | This gene encodes a member of a large family of proteins that activate Rho-type guanosine triphosphate (GTP) metabolizing enzymes. The encoded protein may play a role in clathrin-mediated endocytosis. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. |
| ADCK3 | 56997 | ENSG00000163050 | aarF domain containing kinase 3 | This gene encodes a mitochondrial protein similar to yeast ABC1, which functions in an electron-transferring membrane protein complex in the respiratory chain. It is not related to the family of ABC transporter proteins. Expression of this gene is induced by the tumor suppressor p53 and in response to DNA damage, and inhibiting its expression partially suppresses p53-induced apoptosis. Alternatively spliced transcript variants have been found; however, their full-length nature has not been determined. |
| MYH6 | 4624 | ENSG00000197616 | myosin, heavy chain 6, cardiac muscle, alpha | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. |
| IL1R2 | 7850 | ENSG00000115590 | interleukin 1 receptor type 2 | The protein encoded by this gene is a cytokine receptor that belongs to the interleukin 1 receptor family. This protein binds interleukin alpha (IL1A), interleukin beta (IL1B), and interleukin 1 receptor, type I(IL1R1/IL1RA), and acts as a decoy receptor that inhibits the activity of its ligands. Interleukin 4 (IL4) is reported to antagonize the activity of interleukin 1 by inducing the expression and release of this cytokine. This gene and three other genes form a cytokine receptor gene cluster on chromosome 2q12. Alternative splicing results in multiple transcript variants and protein isoforms. Alternative splicing produces both membrane-bound and soluble proteins. A soluble protein is also produced by proteolytic cleavage. |
| TG | 7038 | ENSG00000042832 | thyroglobulin | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. |
| ADIRF | 10974 | ENSG00000148671 | adipogenesis regulatory factor | APM2 gene is exclusively expressed in adipose tissue. Its function is currently unknown. |
| RAB5B | 5869 | ENSG00000111540 | RAB5B, member RAS oncogene family | NA |
| MMP25 | 64386 | ENSG00000008516 | matrix metallopeptidase 25 | Proteins of the matrix metalloproteinase (MMP) family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. Most MMPs are secreted as inactive proproteins which are activated when cleaved by extracellular proteinases. However, the protein encoded by this gene is a member of the membrane-type MMP (MT-MMP) subfamily, attached to the plasma membrane via a glycosylphosphatidyl inositol anchor. In response to bacterial infection or inflammation, the encoded protein is thought to inactivate alpha-1 proteinase inhibitor, a major tissue protectant against proteolytic enzymes released by activated neutrophils, facilitating the transendothelial migration of neutrophils to inflammatory sites. The encoded protein may also play a role in tumor invasion and metastasis through activation of MMP2. The gene has previously been referred to as MMP20 but has been renamed MMP25. |
| NLRX1 | 79671 | ENSG00000160703 | NLR family member X1 | The protein encoded by this gene is a member of the NLR family and localizes to the outer mitochondrial membrane. The encoded protein is a regulator of mitochondrial antivirus responses. Three transcript variants encoding the same protein have been found for this gene. |
| EPAS1 | 2034 | ENSG00000116016 | endothelial PAS domain protein 1 | This gene encodes a transcription factor involved in the induction of genes regulated by oxygen, which is induced as oxygen levels fall. The encoded protein contains a basic-helix-loop-helix domain protein dimerization domain as well as a domain found in proteins in signal transduction pathways which respond to oxygen levels. Mutations in this gene are associated with erythrocytosis familial type 4. |
| HK3 | 3101 | ENSG00000160883 | hexokinase 3 | Hexokinases phosphorylate glucose to produce glucose-6-phosphate, the first step in most glucose metabolism pathways. This gene encodes hexokinase 3. Similar to hexokinases 1 and 2, this allosteric enzyme is inhibited by its product glucose-6-phosphate. |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",14,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
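The query, display, and write steps above repeat once per cluster with only the row index changing. A minimal sketch of wrapping them in a single helper follows. The name `annotate_cluster` and its `query_fun` argument are hypothetical and not part of the original analysis; the default query simply mirrors the `mygene::queryMany` call used in this report, and the output file name follows the `gene_names_clus_<k>.txt` pattern seen above.

```r
# Hypothetical helper (an assumption, not the author's code): annotate the genes
# of one cluster and write the queried Ensembl IDs to gene_names_clus_<cluster>.txt.
annotate_cluster <- function(ids, cluster, out_dir, query_fun = NULL) {
  if (is.null(query_fun)) {
    # Default mirrors the mygene call used in this report (needs network access).
    query_fun <- function(x) {
      mygene::queryMany(x, scopes = "ensembl.gene",
                        fields = c("name", "summary", "symbol"),
                        species = "human")
    }
  }
  out <- as.data.frame(query_fun(ids))
  out_file <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(out$query, out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(out)
}

# Possible usage, assuming gene_list is the cluster-by-gene matrix built earlier:
# for (k in seq_len(nrow(gene_list))) {
#   out <- annotate_cluster(gene_list[k, ], k, "../utilities/GTEX2013_sparse_fac_sqrt")
#   print(knitr::kable(out))
# }
```

Passing the query function as an argument keeps the file-writing logic testable without a network connection; whether a loop is preferable to the explicit per-cluster calls in this report is a matter of taste.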
out <- mygene::queryMany(gene_list[15,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| name | X_id | summary | symbol | query |
|---|---|---|---|---|
| thyroglobulin | 7038 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | TG | ENSG00000042832 |
| nuclear paraspeckle assembly transcript 1 (non-protein coding) | 283131 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | NEAT1 | ENSG00000245532 |
| thyroid peroxidase | 7173 | This gene encodes a membrane-bound glycoprotein. The encoded protein acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones, thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter, and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene, but the full-length nature of some variants has not been determined. | TPO | ENSG00000115705 |
| paired box 8 | 7849 | This gene encodes a member of the paired box (PAX) family of transcription factors. Members of this gene family typically encode proteins that contain a paired box domain, an octapeptide, and a paired-type homeodomain. This nuclear protein is involved in thyroid follicular cell development and expression of thyroid-specific genes. Mutations in this gene have been associated with thyroid dysgenesis, thyroid follicular carcinomas and atypical follicular thyroid adenomas. Alternatively spliced transcript variants encoding different isoforms have been described. | PAX8 | ENSG00000125618 |
| surfactant protein B | 6439 | This gene encodes the pulmonary-associated surfactant protein B (SPB), an amphipathic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. The SPB enhances the rate of spreading and increases the stability of surfactant monolayers in vitro. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 1, also called pulmonary alveolar proteinosis due to surfactant protein B deficiency, and are associated with fatal respiratory distress in the neonatal period. Alternatively spliced transcript variants encoding the same protein have been identified. | SFTPB | ENSG00000168878 |
| poly(A) binding protein cytoplasmic 1 | 26986 | This gene encodes a poly(A) binding protein. The protein shuttles between the nucleus and cytoplasm and binds to the 3’ poly(A) tail of eukaryotic messenger RNAs via RNA-recognition motifs. The binding of this protein to poly(A) promotes ribosome recruitment and translation initiation; it is also required for poly(A) shortening which is the first step in mRNA decay. The gene is part of a small gene family including three protein-coding genes and several pseudogenes. | PABPC1 | ENSG00000070756 |
| keratin 5 | 3852 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT5 | ENSG00000186081 |
| keratin 7 | 3855 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the simple epithelia lining the cavities of the internal organs and in the gland ducts and blood vessels. The genes encoding the type II cytokeratins are clustered in a region of chromosome 12q12-q13. Alternative splicing may result in several transcript variants; however, not all variants have been fully described. | KRT7 | ENSG00000135480 |
| collagen type IV alpha 3 chain | 1285 | Type IV collagen, the major structural component of basement membranes, is a multimeric protein composed of 3 alpha subunits. These subunits are encoded by 6 different genes, alpha 1 through alpha 6, each of which can form a triple helix structure with 2 other subunits to form type IV collagen. This gene encodes alpha 3. In the Goodpasture syndrome, autoantibodies bind to the collagen molecules in the basement membranes of alveoli and glomeruli. The epitopes that elicit these autoantibodies are localized largely to the non-collagenous C-terminal domain of the protein. A specific kinase phosphorylates amino acids in this same C-terminal region and the expression of this kinase is upregulated during pathogenesis. This gene is also linked to an autosomal recessive form of Alport syndrome. The mutations contributing to this syndrome are also located within the exons that encode this C-terminal region. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. | COL4A3 | ENSG00000169031 |
| alpha-2-macroglobulin | 2 | Alpha-2-macroglobulin is a protease inhibitor and cytokine transporter. It inhibits many proteases, including trypsin, thrombin and collagenase. A2M is implicated in Alzheimer disease (AD) due to its ability to mediate the clearance and degradation of A-beta, the major component of beta-amyloid deposits. | A2M | ENSG00000175899 |
| titin | 7273 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | TTN | ENSG00000155657 |
| ZFP36 ring finger protein | 7538 | NA | ZFP36 | ENSG00000128016 |
| creatine kinase, M-type | 1158 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | CKM | ENSG00000104879 |
| nephronectin | 255743 | NA | NPNT | ENSG00000168743 |
| collagen type IV alpha 4 chain | 1286 | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. This particular collagen IV subunit, however, is only found in a subset of basement membranes. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. Mutations in this gene are associated with type II autosomal recessive Alport syndrome (hereditary glomerulonephropathy) and with familial benign hematuria (thin basement membrane disease). Two transcripts, differing only in their transcription start sites, have been identified for this gene and, as is common for collagen genes, multiple polyadenylation sites are found in the 3’ UTR. | COL4A4 | ENSG00000081052 |
| lipase G, endothelial type | 9388 | The protein encoded by this gene has substantial phospholipase activity and may be involved in lipoprotein metabolism and vascular biology. This protein is designated a member of the TG lipase family by its sequence and characteristic lid region which provides substrate specificity for enzymes of the TG lipase family. | LIPG | ENSG00000101670 |
| myosin, heavy chain 6, cardiac muscle, alpha | 4624 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | MYH6 | ENSG00000197616 |
| inositol polyphosphate-5-phosphatase J | 27124 | NA | INPP5J | ENSG00000185133 |
| eukaryotic translation elongation factor 1 alpha 1 | 1915 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas, and the other isoform (alpha 2) is expressed in brain, heart and skeletal muscle. This isoform is identified as an autoantigen in 66% of patients with Felty syndrome. This gene has been found to have multiple copies on many chromosomes, some of which, if not all, represent different pseudogenes. | EEF1A1 | ENSG00000156508 |
| actin, beta | 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ACTB | ENSG00000075624 |
| small nucleolar RNA host gene 14 | ENSG00000224078 | NA | SNHG14 | ENSG00000224078 |
| carboxypeptidase B1 | 1360 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | CPB1 | ENSG00000153002 |
| serpin family E member 1 | 5054 | This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | SERPINE1 | ENSG00000106366 |
| phosphatidylethanolamine binding protein 4 | 157310 | The phosphatidylethanolamine (PE)-binding proteins, including PEBP4, are an evolutionarily conserved family of proteins with pivotal biologic functions, such as lipid binding and inhibition of serine proteases (Wang et al., 2004 [PubMed 15302887]). | PEBP4 | ENSG00000134020 |
| cardiomyopathy associated 5 | 202333 | NA | CMYA5 | ENSG00000164309 |
| latent transforming growth factor beta binding protein 2 | 4053 | The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | LTBP2 | ENSG00000119681 |
| myosin light chain 6 | 4637 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain that is expressed in smooth muscle and non-muscle tissues. Genomic sequences representing several pseudogenes have been described and two transcript variants encoding different isoforms have been identified for this gene. | MYL6 | ENSG00000092841 |
| ral guanine nucleotide dissociation stimulator like 3 | 57139 | NA | RGL3 | ENSG00000205517 |
| complement factor D | 1675 | This gene encodes a member of the S1, or chymotrypsin, family of serine peptidases. This protease catalyzes the cleavage of factor B, the rate-limiting step of the alternative pathway of complement activation. This protein also functions as an adipokine, a cell signaling protein secreted by adipocytes, which regulates insulin secretion in mice. Mutations in this gene underlie complement factor D deficiency, which is associated with recurrent bacterial meningitis infections in human patients. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate the mature protease. | CFD | ENSG00000197766 |
| dedicator of cytokinesis 5 | 80005 | NA | DOCK5 | ENSG00000147459 |
| collagen type I alpha 2 chain | 1278 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | COL1A2 | ENSG00000164692 |
| erythropoietin receptor | 2057 | This gene encodes the erythropoietin receptor which is a member of the cytokine receptor family. Upon erythropoietin binding, this receptor activates Jak2 tyrosine kinase which activates different intracellular pathways including: Ras/MAP kinase, phosphatidylinositol 3-kinase and STAT transcription factors. The stimulated erythropoietin receptor appears to have a role in erythroid cell survival. Defects in the erythropoietin receptor may produce erythroleukemia and familial erythrocytosis. Dysregulation of this gene may affect the growth of certain tumors. Alternate splicing results in multiple transcript variants. | EPOR | ENSG00000187266 |
| crystallin alpha B | 1410 | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | CRYAB | ENSG00000109846 |
| collagen type XXIII alpha 1 chain | 91522 | COL23A1 is a member of the transmembrane collagens, a subfamily of the nonfibrillar collagens that contain a single pass hydrophobic transmembrane domain (Banyard et al., 2003 [PubMed 12644459]). | COL23A1 | ENSG00000050767 |
| keratin 6A | 3853 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. As many as six of this type II cytokeratin (KRT6) have been identified; the multiplicity of the genes is attributed to successive gene duplication events. The genes are expressed with family members KRT16 and/or KRT17 in the filiform papillae of the tongue, the stratified epithelial lining of oral mucosa and esophagus, the outer root sheath of hair follicles, and the glandular epithelia. This KRT6 gene in particular encodes the most abundant isoform. Mutations in these genes have been associated with pachyonychia congenita. In addition, peptides from the C-terminal region of the protein have antimicrobial activity against bacterial pathogens. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT6A | ENSG00000205420 |
| chromosome 1 open reading frame 198 | 84886 | NA | C1orf198 | ENSG00000119280 |
| advanced glycosylation end product-specific receptor | 177 | The advanced glycosylation end product (AGE) receptor encoded by this gene is a member of the immunoglobulin superfamily of cell surface receptors. It is a multiligand receptor, and besides AGE, interacts with other molecules implicated in homeostasis, development, and inflammation, and certain diseases, such as diabetes and Alzheimer’s disease. Many alternatively spliced transcript variants encoding different isoforms, as well as non-protein-coding variants, have been described for this gene (PMID:18089847). | AGER | ENSG00000204305 |
| glycoprotein 2 | 2813 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | GP2 | ENSG00000169347 |
| complement component 7 | 730 | C7 is a component of the complement system. It participates in the formation of Membrane Attack Complex (MAC). People with C7 deficiency are prone to bacterial infection. | C7 | ENSG00000112936 |
| ring finger protein 144B | 255488 | NA | RNF144B | ENSG00000137393 |
| carboxypeptidase A1 | 1357 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | CPA1 | ENSG00000091704 |
| metallothionein 2A | 4502 | NA | MT2A | ENSG00000125148 |
| glycerol-3-phosphate dehydrogenase 1 | 2819 | This gene encodes a member of the NAD-dependent glycerol-3-phosphate dehydrogenase family. The encoded protein plays a critical role in carbohydrate and lipid metabolism by catalyzing the reversible conversion of dihydroxyacetone phosphate (DHAP) and reduced nicotinamide adenine dinucleotide (NADH) to glycerol-3-phosphate (G3P) and NAD+. The encoded cytosolic protein and mitochondrial glycerol-3-phosphate dehydrogenase also form a glycerol phosphate shuttle that facilitates the transfer of reducing equivalents from the cytosol to mitochondria. Mutations in this gene are a cause of transient infantile hypertriglyceridemia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | GPD1 | ENSG00000167588 |
| phosphoenolpyruvate carboxykinase 1 | 5105 | This gene is a main control point for the regulation of gluconeogenesis. The cytosolic enzyme encoded by this gene, along with GTP, catalyzes the formation of phosphoenolpyruvate from oxaloacetate, with the release of carbon dioxide and GDP. The expression of this gene can be regulated by insulin, glucocorticoids, glucagon, cAMP, and diet. Defects in this gene are a cause of cytosolic phosphoenolpyruvate carboxykinase deficiency. A mitochondrial isozyme of the encoded protein also has been characterized. | PCK1 | ENSG00000124253 |
| LIM domain 7 | 4008 | This gene encodes a protein containing a calponin homology (CH) domain, a PDZ domain, and a LIM domain, and may be involved in protein-protein interactions. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene; however, the full-length nature of some variants is not known. | LMO7 | ENSG00000136153 |
| integrin subunit alpha 3 | 3675 | The gene encodes a member of the integrin alpha chain family of proteins. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain that function as cell surface adhesion molecules. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 3 subunit. This subunit joins with a beta 1 subunit to form an integrin that interacts with extracellular matrix proteins including members of the laminin family. Expression of this gene may be correlated with breast cancer metastasis. | ITGA3 | ENSG00000005884 |
| H19, imprinted maternally expressed transcript (non-protein coding) | 283120 | This gene is located in an imprinted region of chromosome 11 near the insulin-like growth factor 2 (IGF2) gene. This gene is only expressed from the maternally-inherited chromosome, whereas IGF2 is only expressed from the paternally-inherited chromosome. The product of this gene is a long non-coding RNA which functions as a tumor suppressor. Mutations in this gene have been associated with Beckwith-Wiedemann Syndrome and Wilms tumorigenesis. Alternative splicing results in multiple transcript variants. | H19 | ENSG00000130600 |
| ribosomal protein L3 | 6122 | Ribosomes, the complexes that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L3P family of ribosomal proteins and it is located in the cytoplasm. The protein can bind to the HIV-1 TAR mRNA, and it has been suggested that the protein contributes to tat-mediated transactivation. This gene is co-transcribed with several small nucleolar RNA genes, which are located in several of this gene’s introns. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPL3 | ENSG00000100316 |
| CD248 molecule | 57124 | NA | CD248 | ENSG00000174807 |
| retinol binding protein 4 | 5950 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | RBP4 | ENSG00000138207 |
| transglutaminase 3 | 7053 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene consists of two polypeptide chains activated from a single precursor protein by proteolysis. The encoded protein is involved in the later stages of cell envelope formation in the epidermis and hair follicle. | TGM3 | ENSG00000125780 |
| actinin alpha 2 | 88 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | ACTN2 | ENSG00000077522 |
| endothelial PAS domain protein 1 | 2034 | This gene encodes a transcription factor involved in the induction of genes regulated by oxygen, which is induced as oxygen levels fall. The encoded protein contains a basic-helix-loop-helix domain protein dimerization domain as well as a domain found in proteins in signal transduction pathways which respond to oxygen levels. Mutations in this gene are associated with erythrocytosis familial type 4. | EPAS1 | ENSG00000116016 |
| elastin | 2006 | This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | ELN | ENSG00000049540 |
| solute carrier family 4 member 11 | 83959 | This gene encodes a voltage-regulated, electrogenic sodium-coupled borate cotransporter that is essential for borate homeostasis, cell growth and cell proliferation. Mutations in this gene have been associated with a number of endothelial corneal dystrophies including recessive corneal endothelial dystrophy 2, corneal dystrophy and perceptive deafness, and Fuchs endothelial corneal dystrophy. Multiple transcript variants encoding different isoforms have been described. | SLC4A11 | ENSG00000088836 |
| ST3 beta-galactoside alpha-2,3-sialyltransferase 1 | 6482 | The protein encoded by this gene is a type II membrane protein that catalyzes the transfer of sialic acid from CMP-sialic acid to galactose-containing substrates. The encoded protein is normally found in the Golgi but can be proteolytically processed to a soluble form. Correct glycosylation of the encoded protein may be critical to its sialyltransferase activity. This protein, which is a member of glycosyltransferase family 29, can use the same acceptor substrates as does sialyltransferase 4B. Two transcript variants encoding the same protein have been found for this gene. Other transcript variants may exist, but have not been fully characterized yet. | ST3GAL1 | ENSG00000008513 |
| tropomyosin 1 (alpha) | 7168 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | TPM1 | ENSG00000140416 |
| actin, alpha 2, smooth muscle, aorta | 59 | The protein encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in smooth muscle. Defects in this gene cause aortic aneurysm familial thoracic type 6. Multiple alternatively spliced variants, encoding the same protein, have been identified. | ACTA2 | ENSG00000107796 |
| coronin 6 | 84940 | NA | CORO6 | ENSG00000167549 |
| keratin 13 | 3860 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | KRT13 | ENSG00000171401 |
| rhophilin, Rho GTPase binding protein 1 | 114822 | NA | RHPN1 | ENSG00000158106 |
| protease, serine 1 | 5644 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | PRSS1 | ENSG00000204983 |
| regenerating family member 1 alpha | 5967 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1A | ENSG00000115386 |
| dual oxidase 1 | 53905 | The protein encoded by this gene is a glycoprotein and a member of the NADPH oxidase family. The synthesis of thyroid hormone is catalyzed by a protein complex located at the apical membrane of thyroid follicular cells. This complex contains an iodide transporter, thyroperoxidase, and a peroxide generating system that includes proteins encoded by this gene and the similar DUOX2 gene. This protein is known as dual oxidase because it has both a peroxidase homology domain and a gp91phox domain. This protein generates hydrogen peroxide and thereby plays a role in the activity of thyroid peroxidase, lactoperoxidase, and in lactoperoxidase-mediated antimicrobial defense at mucosal surfaces. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. | DUOX1 | ENSG00000137857 |
| actin, alpha, cardiac muscle 1 | 70 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | ACTC1 | ENSG00000159251 |
| albumin | 213 | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | ALB | ENSG00000163631 |
| S100 calcium binding protein B | 6285 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21; however, this gene is located at 21q22.3. This protein may function in neurite extension, proliferation of melanoma cells, stimulation of Ca2+ fluxes, inhibition of PKC-mediated phosphorylation, astrocytosis and axonal proliferation, and inhibition of microtubule assembly. Chromosomal rearrangements and altered expression of this gene have been implicated in several neurological, neoplastic, and other types of diseases, including Alzheimer’s disease, Down’s syndrome, epilepsy, amyotrophic lateral sclerosis, melanoma, and type I diabetes. | S100B | ENSG00000160307 |
| ribosomal protein S8 | 6202 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S8E family of ribosomal proteins. It is located in the cytoplasm. Increased expression of this gene in colorectal tumors and colon polyps compared to matched normal colonic mucosa has been observed. This gene is co-transcribed with the small nucleolar RNA genes U38A, U38B, U39, and U40, which are located in its fourth, fifth, first, and second introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPS8 | ENSG00000142937 |
| pleckstrin homology, MyTH4 and FERM domain containing H1 | 57475 | NA | PLEKHH1 | ENSG00000054690 |
| chymotrypsin like elastase family member 3A | 10136 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | CELA3A | ENSG00000142789 |
| glutathione peroxidase 3 | 2878 | This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | GPX3 | ENSG00000211445 |
| ribosomal protein S12 | 6206 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S12E family of ribosomal proteins. It is located in the cytoplasm. Increased expression of this gene in colorectal cancers compared to matched normal colonic mucosa has been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPS12 | ENSG00000112306 |
| ACTA2 antisense RNA 1 | ENSG00000180139 | NA | ACTA2-AS1 | ENSG00000180139 |
| vascular endothelial growth factor A | 7422 | This gene is a member of the PDGF/VEGF growth factor family. It encodes a heparin-binding protein, which exists as a disulfide-linked homodimer. This growth factor induces proliferation and migration of vascular endothelial cells, and is essential for both physiological and pathological angiogenesis. Disruption of this gene in mice resulted in abnormal embryonic blood vessel formation. This gene is upregulated in many known tumors and its expression is correlated with tumor stage and progression. Elevated levels of this protein are found in patients with POEMS syndrome, also known as Crow-Fukase syndrome. Allelic variants of this gene have been associated with microvascular complications of diabetes 1 (MVCD1) and atherosclerosis. Alternatively spliced transcript variants encoding different isoforms have been described. There is also evidence for alternative translation initiation from upstream non-AUG (CUG) codons resulting in additional isoforms. A recent study showed that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is antiangiogenic. Expression of some isoforms derived from the AUG start codon is regulated by a small upstream open reading frame, which is located within an internal ribosome entry site. | VEGFA | ENSG00000112715 |
| basal cell adhesion molecule (Lutheran blood group) | 4059 | This gene encodes Lutheran blood group glycoprotein, a member of the immunoglobulin superfamily and a receptor for the extracellular matrix protein, laminin. The protein contains five extracellular immunoglobulin domains, a single transmembrane domain, and a short C-terminal cytoplasmic tail. This protein may play a role in epithelial cell cancer and in vaso-occlusion of red blood cells in sickle cell disease. Polymorphisms in this gene define some of the antigens in the Lutheran system and also the Auberger system. Inactivating variants of this gene result in the recessive Lutheran null phenotype, Lu(a-b-), of the Lutheran blood group. Two transcript variants encoding different isoforms have been found for this gene. | BCAM | ENSG00000187244 |
| desmin | 1674 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | DES | ENSG00000175084 |
| nebulin related anchoring protein | 4892 | NA | NRAP | ENSG00000197893 |
| LY6/PLAUR domain containing 3 | 27076 | NA | LYPD3 | ENSG00000124466 |
| CD59 molecule | 966 | This gene encodes a cell surface glycoprotein that regulates complement-mediated cell lysis, and it is involved in lymphocyte signal transduction. This protein is a potent inhibitor of the complement membrane attack complex, whereby it binds complement C8 and/or C9 during the assembly of this complex, thereby inhibiting the incorporation of multiple copies of C9 into the complex, which is necessary for osmolytic pore formation. This protein also plays a role in signal transduction pathways in the activation of T cells. Mutations in this gene cause CD59 deficiency, a disease resulting in hemolytic anemia and thrombosis, and which causes cerebral infarction. Multiple alternatively spliced transcript variants, which encode the same protein, have been identified for this gene. | CD59 | ENSG00000085063 |
| serum amyloid A1 | 6288 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | SAA1 | ENSG00000173432 |
| myosin phosphatase Rho interacting protein | 23164 | NA | MPRIP | ENSG00000133030 |
| inhibitor of DNA binding 3, HLH protein | 3399 | The protein encoded by this gene is a helix-loop-helix (HLH) protein that can form heterodimers with other HLH proteins. However, the encoded protein lacks a basic DNA-binding domain and therefore inhibits the DNA binding of any HLH protein with which it interacts. | ID3 | ENSG00000117318 |
| ubiquitin specific peptidase 54 | 159195 | NA | USP54 | ENSG00000166348 |
| transcription factor CP2-like 1 | 29842 | NA | TFCP2L1 | ENSG00000115112 |
| CAP-Gly domain containing linker protein 3 | 25999 | This gene encodes a member of the cytoplasmic linker protein 170 family. Members of this protein family contain a cytoskeleton-associated protein glycine-rich domain and mediate the interaction of microtubules with cellular organelles. The encoded protein plays a role in T cell apoptosis by facilitating the association of tubulin and the lipid raft ganglioside GD3. The encoded protein also functions as a scaffold protein mediating membrane localization of phosphorylated protein kinase B. Alternatively spliced transcript variants have been observed for this gene. | CLIP3 | ENSG00000105270 |
| ribosomal protein lateral stalk subunit P2 | 6181 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal phosphoprotein that is a component of the 60S subunit. The protein, which is a functional equivalent of the E. coli L7/L12 ribosomal protein, belongs to the L12P family of ribosomal proteins. It plays an important role in the elongation step of protein synthesis. Unlike most ribosomal proteins, which are basic, the encoded protein is acidic. Its C-terminal end is nearly identical to the C-terminal ends of the ribosomal phosphoproteins P0 and P1. The P2 protein can interact with P0 and P1 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | RPLP2 | ENSG00000177600 |
| TEA domain transcription factor 4 | 7004 | This gene product is a member of the transcriptional enhancer factor (TEF) family of transcription factors, which contain the TEA/ATTS DNA-binding domain. It is preferentially expressed in the skeletal muscle, and binds to the M-CAT regulatory element found in promoters of muscle-specific genes to direct their gene expression. Alternatively spliced transcripts encoding distinct isoforms, some of which are translated through the use of a non-AUG (UUG) initiation codon, have been described for this gene. | TEAD4 | ENSG00000197905 |
| amine oxidase, copper containing 3 | 8639 | This gene encodes a member of the semicarbazide-sensitive amine oxidase family. Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes in the presence of copper and quinone cofactor. The encoded protein is localized to the cell surface, has adhesive properties as well as monoamine oxidase activity, and may be involved in leukocyte trafficking. Alterations in levels of the encoded protein may be associated with many diseases, including diabetes mellitus. A pseudogene of this gene has been described and is located approximately 9-kb downstream on the same chromosome. Alternative splicing results in multiple transcript variants. | AOC3 | ENSG00000131471 |
| calreticulin | 811 | Calreticulin is a multifunctional protein that acts as a major Ca(2+)-binding (storage) protein in the lumen of the endoplasmic reticulum. It is also found in the nucleus, suggesting that it may have a role in transcription regulation. Calreticulin binds to the synthetic peptide KLGFFKR, which is almost identical to an amino acid sequence in the DNA-binding domain of the superfamily of nuclear receptors. Calreticulin binds to antibodies in certain sera of systemic lupus and Sjogren patients which contain anti-Ro/SSA antibodies, it is highly conserved among species, and it is located in the endoplasmic and sarcoplasmic reticulum where it may bind calcium. The amino terminus of calreticulin interacts with the DNA-binding domain of the glucocorticoid receptor and prevents the receptor from binding to its specific glucocorticoid response element. Calreticulin can inhibit the binding of androgen receptor to its hormone-responsive DNA element and can inhibit androgen receptor and retinoic acid receptor transcriptional activities in vivo, as well as retinoic acid-induced neuronal differentiation. Thus, calreticulin can act as an important modulator of the regulation of gene transcription by nuclear hormone receptors. Systemic lupus erythematosus is associated with increased autoantibody titers against calreticulin but calreticulin is not a Ro/SS-A antigen. Earlier papers referred to calreticulin as an Ro/SS-A antigen but this was later disproven. Increased autoantibody titer against human calreticulin is found in infants with complete congenital heart block of both the IgG and IgM classes. | CALR | ENSG00000179218 |
| death associated protein kinase 2 | 23604 | This gene encodes a protein that belongs to the serine/threonine protein kinase family. This protein contains a N-terminal protein kinase domain followed by a conserved calmodulin-binding domain with significant similarity to that of death-associated protein kinase 1 (DAPK1), a positive regulator of programmed cell death. Overexpression of this gene was shown to induce cell apoptosis. It uses multiple polyadenylation sites. | DAPK2 | ENSG00000035664 |
| SKI-like proto-oncogene | 6498 | The protein encoded by this gene is a component of the SMAD pathway, which regulates cell growth and differentiation through transforming growth factor-beta (TGFB). In the absence of ligand, the encoded protein binds to the promoter region of TGFB-responsive genes and recruits a nuclear repressor complex. TGFB signaling causes SMAD3 to enter the nucleus and degrade this protein, allowing these genes to be activated. Four transcript variants encoding three different isoforms have been found for this gene. | SKIL | ENSG00000136603 |
| myosin binding protein C, slow type | 4604 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | MYBPC1 | ENSG00000196091 |
| keratin 4 | 3851 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | KRT4 | ENSG00000170477 |
| immunoglobulin heavy constant gamma 1 (G1m marker) | ENSG00000211896 | NA | IGHG1 | ENSG00000211896 |
| frizzled class receptor 1 | 8321 | Members of the ‘frizzled’ gene family encode 7-transmembrane domain proteins that are receptors for Wnt signaling proteins. The FZD1 protein contains a signal peptide, a cysteine-rich domain in the N-terminal extracellular region, 7 transmembrane domains, and a C-terminal PDZ domain-binding motif. The FZD1 transcript is expressed in various tissues. | FZD1 | ENSG00000157240 |
| epoxide hydrolase 1 | 2052 | Epoxide hydrolase is a critical biotransformation enzyme that converts epoxides from the degradation of aromatic compounds to trans-dihydrodiols which can be conjugated and excreted from the body. Epoxide hydrolase functions in both the activation and detoxification of epoxides. Mutations in this gene cause preeclampsia, epoxide hydrolase deficiency or increased epoxide hydrolase activity. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | EPHX1 | ENSG00000143819 |
| activated leukocyte cell adhesion molecule | 214 | This gene encodes activated leukocyte cell adhesion molecule (ALCAM), also known as CD166 (cluster of differentiation 166), which is a member of a subfamily of immunoglobulin receptors with five immunoglobulin-like domains (VVC2C2C2) in the extracellular domain. This protein binds to T-cell differentiation antigen CD6, and is implicated in the processes of cell adhesion and migration. Multiple alternatively spliced transcript variants encoding different isoforms have been found. | ALCAM | ENSG00000170017 |
| solute carrier family 25 member 29 | 123096 | This gene encodes a nuclear-encoded mitochondrial protein that is a member of the large family of solute carrier family 25 (SLC25) mitochondrial transporters. The members of this superfamily are involved in numerous metabolic pathways and cell functions. This gene product was previously reported to be a mitochondrial carnitine-acylcarnitine-like (CACL) translocase (PMID:128829710) or an ornithine transporter (designated ORNT3, PMID:19287344), however, a recent study characterized the main role of this protein as a mitochondrial transporter of basic amino acids, with a preference for arginine and lysine (PMID:24652292). Alternatively spliced transcript variants have been found for this gene. | SLC25A29 | ENSG00000197119 |
| transglutaminase 2 | 7052 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene acts as a monomer, is induced by retinoic acid, and appears to be involved in apoptosis. Finally, the encoded protein is the autoantigen implicated in celiac disease. Two transcript variants encoding different isoforms have been found for this gene. | TGM2 | ENSG00000198959 |
| cysteine rich angiogenic inducer 61 | 3491 | The secreted protein encoded by this gene is growth factor-inducible and promotes the adhesion of endothelial cells. The encoded protein interacts with several integrins and with heparan sulfate proteoglycan. This protein also plays a role in cell proliferation, differentiation, angiogenesis, apoptosis, and extracellular matrix formation. | CYR61 | ENSG00000142871 |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",15,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
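
The query and write-out step is repeated for every cluster, and `queryMany` can flag IDs it cannot resolve in a `notfound` column (as happens for one ID in cluster 16). A minimal sketch of a helper that writes only the resolved IDs is given here. The name `write_cluster_genes`, its arguments, and the toy data frame are hypothetical and not part of the original analysis; the sketch writes to a temporary directory rather than `../utilities/`.

```r
# Hypothetical helper (an assumption, not the original workflow): write the
# Ensembl IDs of one cluster to a text file, skipping IDs flagged as not found.
write_cluster_genes <- function(out, cluster, out_dir = tempdir()) {
  df <- as.data.frame(out)
  # The `notfound` column is only present when at least one query ID failed.
  if ("notfound" %in% names(df)) {
    df <- df[is.na(df$notfound) | !df$notfound, , drop = FALSE]
  }
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(df$query, path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Toy stand-in for a queryMany() result, built from three rows of cluster 16.
toy <- data.frame(
  query    = c("ENSG00000092054", "ENSG00000090920", "ENSG00000231500"),
  symbol   = c("MYH7", NA, "RPS18"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
path <- write_cluster_genes(toy, cluster = 16)
written_ids <- readLines(path)
```

With a real result, the call would look like `write_cluster_genes(out, 16, "../utilities/GTEX2013_sparse_fac_sqrt")`. Whether unresolved IDs should be dropped or kept depends on how the files are used downstream.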
out <- mygene::queryMany(gene_list[16,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | summary | name | X_id | notfound |
|---|---|---|---|---|---|
| MYH7 | ENSG00000092054 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | myosin, heavy chain 7, cardiac muscle, beta | 4625 | NA |
| IGHA1 | ENSG00000211895 | NA | immunoglobulin heavy constant alpha 1 | ENSG00000211895 | NA |
| MB | ENSG00000198125 | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | myoglobin | 4151 | NA |
| PIGR | ENSG00000162896 | This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | polymeric immunoglobulin receptor | 5284 | NA |
| IGHA2 | ENSG00000211890 | NA | immunoglobulin heavy constant alpha 2 (A2m marker) | ENSG00000211890 | NA |
| SFTPB | ENSG00000168878 | This gene encodes the pulmonary-associated surfactant protein B (SPB), an amphipathic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. The SPB enhances the rate of spreading and increases the stability of surfactant monolayers in vitro. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 1, also called pulmonary alveolar proteinosis due to surfactant protein B deficiency, and are associated with fatal respiratory distress in the neonatal period. Alternatively spliced transcript variants encoding the same protein have been identified. | surfactant protein B | 6439 | NA |
| KRT19 | ENSG00000171345 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelops the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | keratin 19 | 3880 | NA |
| SFTPA2 | ENSG00000185303 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | surfactant protein A2 | 729238 | NA |
| C1QB | ENSG00000173369 | This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. Deficiency of C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N terminus and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the B-chain polypeptide of human complement subcomponent C1q. | complement component 1, q subcomponent, B chain | 713 | NA |
| RNASE1 | ENSG00000129538 | This gene encodes a member of the pancreatic-type of secretory ribonucleases, a subset of the ribonuclease A superfamily. The encoded endonuclease cleaves internal phosphodiester RNA bonds on the 3’-side of pyrimidine bases. It prefers poly(C) as a substrate and hydrolyzes 2’,3’-cyclic nucleotides, with a pH optimum near 8.0. The encoded protein is monomeric and more commonly acts to degrade ds-RNA over ss-RNA. Alternative splicing occurs at this locus and four transcript variants encoding the same protein have been identified. | ribonuclease A family member 1, pancreatic | 6035 | NA |
| MYL2 | ENSG00000111245 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | myosin light chain 2 | 4633 | NA |
| TCAP | ENSG00000173991 | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | titin-cap | 8557 | NA |
| C1QC | ENSG00000159189 | This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. A deficiency in C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N-terminus, and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the C-chain polypeptide of human complement subcomponent C1q. Alternatively spliced transcript variants that encode the same protein have been found for this gene. | complement component 1, q subcomponent, C chain | 714 | NA |
| SFTPA1 | ENSG00000122852 | This gene encodes a lung surfactant protein that is a member of a subfamily of C-type lectins called collectins. The encoded protein binds specific carbohydrate moieties found on lipids and on the surface of microorganisms. This protein plays an essential role in surfactant homeostasis and in the defense against respiratory pathogens. Mutations in this gene are associated with idiopathic pulmonary fibrosis. Alternate splicing results in multiple transcript variants. | surfactant protein A1 | 653509 | NA |
| SCD | ENSG00000099194 | This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | stearoyl-CoA desaturase | 6319 | NA |
| SFTPC | ENSG00000168484 | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | surfactant protein C | 6440 | NA |
| RPL13 | ENSG00000167526 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L13E family of ribosomal proteins. It is located in the cytoplasm. This gene is expressed at significantly higher levels in benign breast lesions than in breast carcinomas. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ribosomal protein L13 | 6137 | NA |
| KRT10 | ENSG00000186395 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | keratin 10 | 3858 | NA |
| LGALS4 | ENSG00000171747 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | galectin 4 | 3960 | NA |
| CA2 | ENSG00000104267 | The protein encoded by this gene is one of several isozymes of carbonic anhydrase, which catalyzes reversible hydration of carbon dioxide. Defects in this enzyme are associated with osteopetrosis and renal tubular acidosis. Two transcript variants encoding different isoforms have been found for this gene. | carbonic anhydrase 2 | 760 | NA |
| PTPRF | ENSG00000142949 | The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This PTP possesses an extracellular region, a single transmembrane region, and two tandem intracytoplasmic catalytic domains, and thus represents a receptor-type PTP. The extracellular region contains three Ig-like domains, and nine non-Ig like domains similar to that of neural-cell adhesion molecule. This PTP was shown to function in the regulation of epithelial cell-cell contacts at adherens junctions, as well as in the control of beta-catenin signaling. An increased expression level of this protein was found in the insulin-responsive tissue of obese, insulin-resistant individuals, and may contribute to the pathogenesis of insulin resistance. Two alternatively spliced transcript variants of this gene, which encode distinct proteins, have been reported. | protein tyrosine phosphatase, receptor type F | 5792 | NA |
| CKM | ENSG00000104879 | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | creatine kinase, M-type | 1158 | NA |
| C3 | ENSG00000125730 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | complement component 3 | 718 | NA |
| CD163 | ENSG00000177575 | The protein encoded by this gene is a member of the scavenger receptor cysteine-rich (SRCR) superfamily, and is exclusively expressed in monocytes and macrophages. It functions as an acute phase-regulated receptor involved in the clearance and endocytosis of hemoglobin/haptoglobin complexes by macrophages, and may thereby protect tissues from free hemoglobin-mediated oxidative damage. This protein may also function as an innate immune sensor for bacteria and inducer of local inflammation. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | CD163 molecule | 9332 | NA |
| COL1A1 | ENSG00000108821 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 1 | 1277 | NA |
| SELENBP1 | ENSG00000143416 | This gene encodes a member of the selenium-binding protein family. Selenium is an essential nutrient that exhibits potent anticarcinogenic properties, and deficiency of selenium may cause certain neurologic diseases. The effects of selenium in preventing cancer and neurologic diseases may be mediated by selenium-binding proteins, and decreased expression of this gene may be associated with several types of cancer. The encoded protein may play a selenium-dependent role in ubiquitination/deubiquitination-mediated protein degradation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | selenium binding protein 1 | 8991 | NA |
| YBX3 | ENSG00000060138 | NA | Y-box binding protein 3 | 8531 | NA |
| C1QA | ENSG00000173372 | This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. Deficiency of C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N terminus and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the A-chain polypeptide of human complement subcomponent C1q. | complement component 1, q subcomponent, A chain | 712 | NA |
| TPM2 | ENSG00000198467 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | tropomyosin 2 (beta) | 7169 | NA |
| CD44 | ENSG00000026508 | The protein encoded by this gene is a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. It is a receptor for hyaluronic acid (HA) and can also interact with other ligands, such as osteopontin, collagens, and matrix metalloproteinases (MMPs). This protein participates in a wide variety of cellular functions including lymphocyte activation, recirculation and homing, hematopoiesis, and tumor metastasis. Transcripts for this gene undergo complex alternative splicing that results in many functionally distinct isoforms, however, the full length nature of some of these variants has not been determined. Alternative splicing is the basis for the structural and functional diversity of this protein, and may be related to tumor metastasis. | CD44 molecule (Indian blood group) | 960 | NA |
| FLNC | ENSG00000128591 | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | filamin C | 2318 | NA |
| NA | ENSG00000090920 | NA | NA | NA | TRUE |
| RPS18 | ENSG00000231500 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S13P family of ribosomal proteins. It is located in the cytoplasm. The gene product of the E. coli ortholog (ribosomal protein S13) is involved in the binding of fMet-tRNA, and thus, in the initiation of translation. This gene is an ortholog of mouse Ke3. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ribosomal protein S18 | 6222 | NA |
| ACSL5 | ENSG00000197142 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. This isozyme is highly expressed in uterus and spleen, and in trace amounts in normal brain, but has markedly increased levels in malignant gliomas. This gene functions in mediating fatty acid-induced glioma cell growth. Three transcript variants encoding two different isoforms have been found for this gene. | acyl-CoA synthetase long-chain family member 5 | 51703 | NA |
| COL1A2 | ENSG00000164692 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | collagen type I alpha 2 chain | 1278 | NA |
| TPM1 | ENSG00000140416 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | tropomyosin 1 (alpha) | 7168 | NA |
| SERPINA1 | ENSG00000197249 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | serpin family A member 1 | 5265 | NA |
| KRT1 | ENSG00000167768 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 1 | 3848 | NA |
| TNNC1 | ENSG00000114854 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | troponin C1, slow skeletal and cardiac type | 7134 | NA |
| PCK2 | ENSG00000100889 | This gene encodes a mitochondrial enzyme that catalyzes the conversion of oxaloacetate to phosphoenolpyruvate in the presence of guanosine triphosphate (GTP). A cytosolic form of this protein is encoded by a different gene and is the key enzyme of gluconeogenesis in the liver. Alternatively spliced transcript variants have been described. | phosphoenolpyruvate carboxykinase 2, mitochondrial | 5106 | NA |
| MTCO1P12 | ENSG00000237973 | NA | MT-CO1 pseudogene 12 | ENSG00000237973 | NA |
| GAPDH | ENSG00000111640 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | glyceraldehyde-3-phosphate dehydrogenase | 2597 | NA |
| MYH11 | ENSG00000133392 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | 4629 | NA |
| PLS1 | ENSG00000120756 | Plastins are a family of actin-binding proteins that are conserved throughout eukaryote evolution and expressed in most tissues of higher eukaryotes. In humans, two ubiquitous plastin isoforms (L and T) have been identified. The protein encoded by this gene is a third distinct plastin isoform, which is specifically expressed at high levels in the small intestine. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. A pseudogene of this gene is found on chromosome 11. | plastin 1 | 5357 | NA |
| GPX2 | ENSG00000176153 | This gene is a member of the glutathione peroxidase family and encodes a selenium-dependent glutathione peroxidase that is one of two isoenzymes responsible for the majority of the glutathione-dependent hydrogen peroxide-reducing activity in the epithelium of the gastrointestinal tract. The protein encoded by this locus contains a selenocysteine (Sec) residue encoded by the UGA codon, which normally signals translation termination. Alternatively spliced transcript variants have been described. | glutathione peroxidase 2 | 2877 | NA |
| MUC1 | ENSG00000185499 | This gene encodes a membrane-bound protein that is a member of the mucin family. Mucins are O-glycosylated proteins that play an essential role in forming protective mucous barriers on epithelial surfaces. These proteins also play a role in intracellular signaling. This protein is expressed on the apical surface of epithelial cells that line the mucosal surfaces of many different tissues including lung, breast, stomach and pancreas. This protein is proteolytically cleaved into alpha and beta subunits that form a heterodimeric complex. The N-terminal alpha subunit functions in cell-adhesion and the C-terminal beta subunit is involved in cell signaling. Overexpression, aberrant intracellular localization, and changes in glycosylation of this protein have been associated with carcinomas. This gene is known to contain a highly polymorphic variable number tandem repeats (VNTR) domain. Alternate splicing results in multiple transcript variants. | mucin 1, cell surface associated | 4582 | NA |
| LLGL2 | ENSG00000073350 | The lethal (2) giant larvae protein of Drosophila plays a role in asymmetric cell division, epithelial cell polarity, and cell migration. This human gene encodes a protein similar to lethal (2) giant larvae of Drosophila. In fly, the protein’s ability to localize cell fate determinants is regulated by the atypical protein kinase C (aPKC). In human, this protein interacts with aPKC-containing complexes and is cortically localized in mitotic cells. Alternative splicing results in multiple transcript variants encoding different isoforms. | LLGL2, scribble cell polarity complex component | 3993 | NA |
| CES2 | ENSG00000172831 | This gene encodes a member of the carboxylesterase large family. The family members are responsible for the hydrolysis or transesterification of various xenobiotics, such as cocaine and heroin, and endogenous substrates with ester, thioester, or amide bonds. They may participate in fatty acyl and cholesterol ester metabolism, and may play a role in the blood-brain barrier system. The protein encoded by this gene is the major intestinal enzyme and functions in intestine drug clearance. Alternatively spliced transcript variants have been found for this gene. | carboxylesterase 2 | 8824 | NA |
| RPS3 | ENSG00000149273 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit, where it forms part of the domain where translation is initiated. The protein belongs to the S3P family of ribosomal proteins. Studies of the mouse and rat proteins have demonstrated that the protein has an extraribosomal role as an endonuclease involved in the repair of UV-induced DNA damage. The protein appears to be located in both the cytoplasm and nucleus but not in the nucleolus. Higher levels of expression of this gene in colon adenocarcinomas and adenomatous polyps compared to adjacent normal colonic mucosa have been observed. This gene is co-transcribed with the small nucleolar RNA genes U15A and U15B, which are located in its first and fifth introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Multiple alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ribosomal protein S3 | 6188 | NA |
| C1orf115 | ENSG00000162817 | NA | chromosome 1 open reading frame 115 | 79762 | NA |
| RP11-510N19.5 | ENSG00000249007 | NA | NA | ENSG00000249007 | NA |
| FBLN1 | ENSG00000077942 | Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | fibulin 1 | 2192 | NA |
| CCDC80 | ENSG00000091986 | NA | coiled-coil domain containing 80 | 151887 | NA |
| ACTC1 | ENSG00000159251 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | actin, alpha, cardiac muscle 1 | 70 | NA |
| RBM47 | ENSG00000163694 | NA | RNA binding motif protein 47 | 54502 | NA |
| SLC26A2 | ENSG00000155850 | The diastrophic dysplasia sulfate transporter is a transmembrane glycoprotein implicated in the pathogenesis of several human chondrodysplasias. It apparently is critical in cartilage for sulfation of proteoglycans and matrix organization. | solute carrier family 26 member 2 | 1836 | NA |
| LSR | ENSG00000105699 | NA | lipolysis stimulated lipoprotein receptor | 51599 | NA |
| FZD5 | ENSG00000163251 | Members of the ‘frizzled’ gene family encode 7-transmembrane domain proteins that are receptors for Wnt signaling proteins. The FZD5 protein is believed to be the receptor for the Wnt5A ligand. | frizzled class receptor 5 | 7855 | NA |
| CNDP2 | ENSG00000133313 | CNDP2, also known as tissue carnosinase and peptidase A (EC 3.4.13.18), is a nonspecific dipeptidase rather than a selective carnosinase (Teufel et al., 2003 [PubMed 12473676]). | CNDP dipeptidase 2 (metallopeptidase M20 family) | 55748 | NA |
| KRT2 | ENSG00000172867 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | keratin 2 | 3849 | NA |
| RETSAT | ENSG00000042445 | NA | retinol saturase | 54884 | NA |
| RPS19 | ENSG00000105372 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S19E family of ribosomal proteins. It is located in the cytoplasm. Mutations in this gene cause Diamond-Blackfan anemia (DBA), a constitutional erythroblastopenia characterized by absent or decreased erythroid precursors, in a subset of patients. This suggests a possible extra-ribosomal function for this gene in erythropoietic differentiation and proliferation, in addition to its ribosomal function. Higher expression levels of this gene in some primary colon carcinomas compared to matched normal colon tissues have been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ribosomal protein S19 | 6223 | NA |
| ACTN2 | ENSG00000077522 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | actinin alpha 2 | 88 | NA |
| TTN | ENSG00000155657 | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | titin | 7273 | NA |
| CMYA5 | ENSG00000164309 | NA | cardiomyopathy associated 5 | 202333 | NA |
| MT1G | ENSG00000125144 | NA | metallothionein 1G | 4495 | NA |
| PDE4DIP | ENSG00000178104 | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | phosphodiesterase 4D interacting protein | 9659 | NA |
| RPS8 | ENSG00000142937 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 40S subunit. The protein belongs to the S8E family of ribosomal proteins. It is located in the cytoplasm. Increased expression of this gene in colorectal tumors and colon polyps compared to matched normal colonic mucosa has been observed. This gene is co-transcribed with the small nucleolar RNA genes U38A, U38B, U39, and U40, which are located in its fourth, fifth, first, and second introns, respectively. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ribosomal protein S8 | 6202 | NA |
| CYP3A5 | ENSG00000106258 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The encoded protein metabolizes drugs as well as the steroid hormones testosterone and progesterone. This gene is part of a cluster of cytochrome P450 genes on chromosome 7q21.1. Two pseudogenes of this gene have been identified within this cluster on chromosome 7. Expression of this gene is widely variable among populations, and a single nucleotide polymorphism that affects transcript splicing has been associated with susceptibility to hypertension. Alternative splicing results in multiple transcript variants. | cytochrome P450 family 3 subfamily A member 5 | 1577 | NA |
| ABCC3 | ENSG00000108846 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MRP subfamily which is involved in multi-drug resistance. The specific function of this protein has not yet been determined; however, this protein may play a role in the transport of biliary and intestinal excretion of organic anions. Alternatively spliced variants which encode different protein isoforms have been described; however, not all variants have been fully characterized. | ATP binding cassette subfamily C member 3 | 8714 | NA |
| MRC2 | ENSG00000011028 | This gene encodes a member of the mannose receptor family of proteins that contain a fibronectin type II domain and multiple C-type lectin-like domains. The encoded protein plays a role in extracellular matrix remodeling by mediating the internalization and lysosomal degradation of collagen ligands. Expression of this gene may play a role in the tumorigenesis and metastasis of several malignancies including breast cancer, gliomas and metastatic bone disease. | mannose receptor C type 2 | 9902 | NA |
| TMEM37 | ENSG00000171227 | NA | transmembrane protein 37 | 140738 | NA |
| COX6A2 | ENSG00000156885 | Cytochrome c oxidase (COX), the terminal enzyme of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. It is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may be involved in the regulation and assembly of the complex. This nuclear gene encodes polypeptide 2 (heart/muscle isoform) of subunit VIa, and polypeptide 2 is present only in striated muscles. Polypeptide 1 (liver isoform) of subunit VIa is encoded by a different gene, and is found in all non-muscle tissues. These two polypeptides share 66% amino acid sequence identity. | cytochrome c oxidase subunit 6A2 | 1339 | NA |
| COL7A1 | ENSG00000114270 | This gene encodes the alpha chain of type VII collagen. The type VII collagen fibril, composed of three identical alpha collagen chains, is restricted to the basement zone beneath stratified squamous epithelia. It functions as an anchoring fibril between the external epithelia and the underlying stroma. Mutations in this gene are associated with all forms of dystrophic epidermolysis bullosa. In the absence of mutations, however, an acquired form of this disease can result from an autoimmune response made to type VII collagen. | collagen type VII alpha 1 | 1294 | NA |
| GLUL | ENSG00000135821 | The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | glutamate-ammonia ligase | 2752 | NA |
| FMO5 | ENSG00000131781 | Metabolic N-oxidation of the diet-derived amino-trimethylamine (TMA) is mediated by flavin-containing monooxygenase and is subject to an inherited FMO3 polymorphism in man resulting in a small subpopulation with reduced TMA N-oxidation capacity resulting in fish odor syndrome Trimethylaminuria. Three forms of the enzyme, FMO1 found in fetal liver, FMO2 found in adult liver, and FMO3 are encoded by genes clustered in the 1q23-q25 region. Flavin-containing monooxygenases are NADPH-dependent flavoenzymes that catalyze the oxidation of soft nucleophilic heteroatom centers in drugs, pesticides, and xenobiotics. Alternative splicing results in multiple transcript variants. | flavin containing monooxygenase 5 | 2330 | NA |
| EZR | ENSG00000092820 | The cytoplasmic peripheral membrane protein encoded by this gene functions as a protein-tyrosine kinase substrate in microvilli. As a member of the ERM protein family, this protein serves as an intermediate between the plasma membrane and the actin cytoskeleton. This protein plays a key role in cell surface structure adhesion, migration and organization, and it has been implicated in various human cancers. A pseudogene located on chromosome 3 has been identified for this gene. Alternatively spliced variants have also been described for this gene. | ezrin | 7430 | NA |
| COL6A2 | ENSG00000142173 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | collagen type VI alpha 2 | 1292 | NA |
| SGK1 | ENSG00000118515 | This gene encodes a serine/threonine protein kinase that plays an important role in cellular stress response. This kinase activates certain potassium, sodium, and chloride channels, suggesting an involvement in the regulation of processes such as cell survival, neuronal excitability, and renal sodium excretion. High levels of expression of this gene may contribute to conditions such as hypertension and diabetic nephropathy. Several alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | serum/glucocorticoid regulated kinase 1 | 6446 | NA |
| RPL18 | ENSG00000063177 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the L18E family of ribosomal proteins that is a component of the 60S subunit. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ribosomal protein L18 | 6141 | NA |
| IGHM | ENSG00000211899 | NA | immunoglobulin heavy constant mu | ENSG00000211899 | NA |
| PBLD | ENSG00000108187 | NA | phenazine biosynthesis like protein domain containing | 64081 | NA |
| RPS11 | ENSG00000142534 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a member of the S17P family of ribosomal proteins that is a component of the 40S subunit. This gene is co-transcribed with the small nucleolar RNA gene U35B, which is located in the third intron. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed throughout the genome. | ribosomal protein S11 | 6205 | NA |
| CHGA | ENSG00000100604 | The protein encoded by this gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins. It is found in secretory vesicles of neurons and endocrine cells. This gene product is a precursor to three biologically active peptides; vasostatin, pancreastatin, and parastatin. These peptides act as autocrine or paracrine negative modulators of the neuroendocrine system. Two other peptides, catestatin and chromofungin, have antimicrobial activity and antifungal activity, respectively. Two transcript variants encoding different isoforms have been found for this gene. | chromogranin A | 1113 | NA |
| TSPAN13 | ENSG00000106537 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | tetraspanin 13 | 27075 | NA |
| ST14 | ENSG00000149418 | The protein encoded by this gene is an epithelial-derived, integral membrane serine protease. This protease forms a complex with the Kunitz-type serine protease inhibitor, HAI-1, and is found to be activated by sphingosine 1-phosphate. This protease has been shown to cleave and activate hepatocyte growth factor/scattering factor, and urokinase plasminogen activator, which suggest the function of this protease as an epithelial membrane activator for other proteases and latent growth factors. The expression of this protease has been associated with breast, colon, prostate, and ovarian tumors, which implicates its role in cancer invasion, and metastasis. | suppression of tumorigenicity 14 | 6768 | NA |
| CLDN7 | ENSG00000181885 | This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. Differential expression of this gene has been observed in different types of malignancies, including breast cancer, ovarian cancer, hepatocellular carcinomas, urinary tumors, prostate cancer, lung cancer, head and neck cancers, thyroid carcinomas, etc. Alternatively spliced transcript variants encoding different isoforms have been found. | claudin 7 | 1366 | NA |
| IGLC1 | ENSG00000211675 | NA | immunoglobulin lambda constant 1 (Mcg marker) | ENSG00000211675 | NA |
| RPL10A | ENSG00000198755 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal protein that is a component of the 60S subunit. The protein belongs to the L1P family of ribosomal proteins. It is located in the cytoplasm. The expression of this gene is downregulated in the thymus by cyclosporin-A (CsA), an immunosuppressive drug. Studies in mice have shown that the expression of the ribosomal protein L10a gene is downregulated in neural precursor cells during development. This gene previously was referred to as NEDD6 (neural precursor cell expressed, developmentally downregulated 6), but it has been renamed RPL10A (ribosomal protein 10a). As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ribosomal protein L10a | 4736 | NA |
| SAA2 | ENSG00000134339 | NA | serum amyloid A2 | 6289 | NA |
| STAB1 | ENSG00000010327 | This gene encodes a large, transmembrane receptor protein which may function in angiogenesis, lymphocyte homing, cell adhesion, or receptor scavenging. The protein contains 7 fasciclin, 16 epidermal growth factor (EGF)-like, and 2 laminin-type EGF-like domains as well as a C-type lectin-like hyaluronan-binding Link module. The protein is primarily expressed on sinusoidal endothelial cells of liver, spleen, and lymph node. The receptor has been shown to endocytose ligands such as low density lipoprotein, Gram-positive and Gram-negative bacteria, and advanced glycosylation end products. Supporting its possible role as a scavenger receptor, the protein rapidly cycles between the plasma membrane and early endosomes. | stabilin 1 | 23166 | NA |
| IGLL5 | ENSG00000254709 | This gene encodes one of the immunoglobulin lambda-like polypeptides. It is located within the immunoglobulin lambda locus but it does not require somatic rearrangement for expression. The first exon of this gene is unrelated to immunoglobulin variable genes; the second and third exons are the immunoglobulin lambda joining 1 and the immunoglobulin lambda constant 1 gene segments. Alternative splicing results in multiple transcript variants. | immunoglobulin lambda like polypeptide 5 | 100423062 | NA |
| CRYL1 | ENSG00000165475 | The uronate cycle functions as an alternative glucose metabolic pathway, accounting for about 5% of daily glucose catabolism. The product of this gene catalyzes the dehydrogenation of L-gulonate into dehydro-L-gulonate in the uronate cycle. The enzyme requires NAD(H) as a coenzyme, and is inhibited by inorganic phosphate. A similar gene in the rabbit is thought to serve a structural role in the lens of the eye. | crystallin lambda 1 | 51084 | NA |
| METTL7A | ENSG00000185432 | NA | methyltransferase like 7A | 25840 | NA |
| PEPD | ENSG00000124299 | This gene encodes a member of the peptidase family. The protein forms a homodimer that hydrolyzes dipeptides or tripeptides with C-terminal proline or hydroxyproline residues. The enzyme serves an important role in the recycling of proline, and may be rate limiting for the production of collagen. Mutations in this gene result in prolidase deficiency, which is characterized by the excretion of large amount of di- and tri-peptides containing proline. Multiple transcript variants encoding different isoforms have been found for this gene. | peptidase D | 5184 | NA |
| IQGAP2 | ENSG00000145703 | This gene encodes a member of the IQGAP family. The protein contains three IQ domains, one calponin homology domain, one Ras-GAP domain and one WW domain. It interacts with components of the cytoskeleton, with cell adhesion molecules, and with several signaling molecules to regulate cell morphology and motility. | IQ motif containing GTPase activating protein 2 | 10788 | NA |
| OLFM4 | ENSG00000102837 | This gene was originally cloned from human myeloblasts and found to be selectively expressed in inflamed colonic epithelium. This gene encodes a member of the olfactomedin family. The encoded protein is an antiapoptotic factor that promotes tumor growth and is an extracellular matrix glycoprotein that facilitates cell adhesion. | olfactomedin 4 | 10562 | NA |
| FRMD6 | ENSG00000139926 | NA | FERM domain containing 6 | 122786 | NA |
| RPLP1 | ENSG00000137818 | Ribosomes, the organelles that catalyze protein synthesis, consist of a small 40S subunit and a large 60S subunit. Together these subunits are composed of 4 RNA species and approximately 80 structurally distinct proteins. This gene encodes a ribosomal phosphoprotein that is a component of the 60S subunit. The protein, which is a functional equivalent of the E. coli L7/L12 ribosomal protein, belongs to the L12P family of ribosomal proteins. It plays an important role in the elongation step of protein synthesis. Unlike most ribosomal proteins, which are basic, the encoded protein is acidic. Its C-terminal end is nearly identical to the C-terminal ends of the ribosomal phosphoproteins P0 and P2. The P1 protein can interact with P0 and P2 to form a pentameric complex consisting of P1 and P2 dimers, and a P0 monomer. The protein is located in the cytoplasm. Two alternatively spliced transcript variants that encode different proteins have been observed. As is typical for genes encoding ribosomal proteins, there are multiple processed pseudogenes of this gene dispersed through the genome. | ribosomal protein lateral stalk subunit P1 | 6176 | NA |
| ACTA1 | ENSG00000143632 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | actin, alpha 1, skeletal muscle | 58 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 16, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
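
The query-then-write steps are repeated by hand for each cluster. A minimal sketch of how they might be wrapped in a helper is given here. The function name `annotate_cluster` and its arguments `out_dir` and `query_fun` are hypothetical and not part of the original analysis, and the sketch assumes that the index in the output file name matches the row of `gene_list` that was queried.

```r
# Hypothetical helper (an illustration, not the original workflow):
# annotate the top genes of one cluster and save the queried Ensembl IDs.
# query_fun defaults to mygene::queryMany; it is an argument only so that
# the helper can be exercised offline with a stand-in function.
annotate_cluster <- function(gene_list, cluster, out_dir,
                             query_fun = mygene::queryMany) {
  # Same query as in the per-cluster calls in this document
  out <- query_fun(gene_list[cluster, ],
                   scopes = "ensembl.gene",
                   fields = c("name", "summary", "symbol"),
                   species = "human")
  # Assumed naming scheme: gene_names_clus_<cluster>.txt
  out_file <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(as.character(out$query), out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(out)
}

# Possible usage (left commented out because it contacts the MyGene.info service):
# out <- annotate_cluster(gene_list, 17, "../utilities/GTEX2013_sparse_fac_sqrt")
# kable(as.data.frame(out))
```

Returning the query result invisibly keeps the helper usable both in a loop over clusters and in a single call followed by `kable()`.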
out <- mygene::queryMany(gene_list[17,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | query | symbol | summary | X_id | notfound |
|---|---|---|---|---|---|
| myelin basic protein | ENSG00000197971 | MBP | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | 4155 | NA |
| NA | ENSG00000266844 | RP11-862L9.3 | NA | ENSG00000266844 | NA |
| glial fibrillary acidic protein | ENSG00000131095 | GFAP | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 2670 | NA |
| alanyl aminopeptidase, membrane | ENSG00000166825 | ANPEP | Aminopeptidase N is located in the small-intestinal and renal microvillar membrane, and also in other plasma membranes. In the small intestine aminopeptidase N plays a role in the final digestion of peptides generated from hydrolysis of proteins by gastric and pancreatic proteases. Its function in proximal tubular epithelial cells and other cell types is less clear. The large extracellular carboxyterminal domain contains a pentapeptide consensus sequence characteristic of members of the zinc-binding metalloproteinase superfamily. Sequence comparisons with known enzymes of this class showed that CD13 and aminopeptidase N are identical. The latter enzyme was thought to be involved in the metabolism of regulatory peptides by diverse cell types, including small intestinal and renal tubular epithelial cells, macrophages, granulocytes, and synaptic membranes from the CNS. Human aminopeptidase N is a receptor for one strain of human coronavirus that is an important cause of upper respiratory tract infections. Defects in this gene appear to be a cause of various types of leukemia or lymphoma. | 290 | NA |
| keratin 13 | ENSG00000171401 | KRT13 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. This type I cytokeratin is paired with keratin 4 and expressed in the suprabasal layers of non-cornified stratified epithelia. Mutations in this gene and keratin 4 have been associated with the autosomal dominant disorder White Sponge Nevus. The type I cytokeratins are clustered in a region of chromosome 17q21.2. Alternative splicing of this gene results in multiple transcript variants; however, not all variants have been described. | 3860 | NA |
| maturin, neural progenitor differentiation regulator homolog (Xenopus) | ENSG00000180354 | MTURN | NA | 222166 | NA |
| S100 calcium binding protein B | ENSG00000160307 | S100B | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21; however, this gene is located at 21q22.3. This protein may function in Neurite extension, proliferation of melanoma cells, stimulation of Ca2+ fluxes, inhibition of PKC-mediated phosphorylation, astrocytosis and axonal proliferation, and inhibition of microtubule assembly. Chromosomal rearrangements and altered expression of this gene have been implicated in several neurological, neoplastic, and other types of diseases, including Alzheimer’s disease, Down’s syndrome, epilepsy, amyotrophic lateral sclerosis, melanoma, and type I diabetes. | 6285 | NA |
| regulator of G-protein signaling 1 | ENSG00000090104 | RGS1 | This gene encodes a member of the regulator of G-protein signalling family. This protein is located on the cytosolic side of the plasma membrane and contains a conserved, 120 amino acid motif called the RGS domain. The protein attenuates the signalling activity of G-proteins by binding to activated, GTP-bound G alpha subunits and acting as a GTPase activating protein (GAP), increasing the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. | 5996 | NA |
| latent transforming growth factor beta binding protein 4 | ENSG00000090006 | LTBP4 | The protein encoded by this gene binds transforming growth factor beta (TGFB) as it is secreted and targeted to the extracellular matrix. TGFB is biologically latent after secretion and insertion into the extracellular matrix, and sheds TGFB and other proteins upon activation. Defects in this gene may be a cause of cutis laxa and severe pulmonary, gastrointestinal, and urinary abnormalities. Three transcript variants encoding different isoforms have been found for this gene. | 8425 | NA |
| collagen type I alpha 1 | ENSG00000108821 | COL1A1 | This gene encodes the pro-alpha1 chains of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIA, Ehlers-Danlos syndrome Classical type, Caffey Disease and idiopathic osteoporosis. Reciprocal translocations between chromosomes 17 and 22, where this gene and the gene for platelet-derived growth factor beta are located, are associated with a particular type of skin tumor called dermatofibrosarcoma protuberans, resulting from unregulated expression of the growth factor. Two transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1277 | NA |
| N-myc downstream regulated 1 | ENSG00000104419 | NDRG1 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein involved in stress responses, hormone responses, cell growth, and differentiation. The encoded protein is necessary for p53-mediated caspase activation and apoptosis. Mutations in this gene are a cause of Charcot-Marie-Tooth disease type 4D, and expression of this gene may be a prognostic indicator for several types of cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 10397 | NA |
| keratin 4 | ENSG00000170477 | KRT4 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in differentiated layers of the mucosal and esophageal epithelia with family member KRT13. Mutations in these genes have been associated with White Sponge Nevus, characterized by oral, esophageal, and anal leukoplakia. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | 3851 | NA |
| heat shock protein 90kDa alpha family class A member 1 | ENSG00000080824 | HSP90AA1 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | 3320 | NA |
| claudin domain containing 1 | ENSG00000080822 | CLDND1 | NA | 56650 | NA |
| pleckstrin homology domain containing B1 | ENSG00000021300 | PLEKHB1 | NA | 58473 | NA |
| collagen type I alpha 2 chain | ENSG00000164692 | COL1A2 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | 1278 | NA |
| membrane metallo-endopeptidase | ENSG00000196549 | MME | This gene encodes a common acute lymphocytic leukemia antigen that is an important cell surface marker in the diagnosis of human acute lymphocytic leukemia (ALL). This protein is present on leukemic cells of pre-B phenotype, which represent 85% of cases of ALL. This protein is not restricted to leukemic cells, however, and is found on a variety of normal tissues. It is a glycoprotein that is particularly abundant in kidney, where it is present on the brush border of proximal tubules and on glomerular epithelium. The protein is a neutral endopeptidase that cleaves peptides at the amino side of hydrophobic residues and inactivates several peptide hormones including glucagon, enkephalins, substance P, neurotensin, oxytocin, and bradykinin. This gene, which encodes a 100-kD type II transmembrane glycoprotein, exists in a single copy of greater than 45 kb. The 5’ untranslated region of this gene is alternatively spliced, resulting in four separate mRNA transcripts. The coding region is not affected by alternative splicing. | 4311 | NA |
| tumor protein p53 inducible nuclear protein 2 | ENSG00000078804 | TP53INP2 | NA | 58476 | NA |
| progestin and adipoQ receptor family member 6 | ENSG00000160781 | PAQR6 | NA | 79957 | NA |
| eukaryotic translation elongation factor 2 | ENSG00000167658 | EEF2 | This gene encodes a member of the GTP-binding translation elongation factor family. This protein is an essential factor for protein synthesis. It promotes the GTP-dependent translocation of the nascent protein chain from the A-site to the P-site of the ribosome. This protein is completely inactivated by EF-2 kinase phosphorylation. | 1938 | NA |
| laminin subunit alpha 5 | ENSG00000130702 | LAMA5 | This gene encodes one of the vertebrate laminin alpha chains. Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. The protein encoded by this gene is the alpha-5 subunit of laminin-10 (laminin-511), laminin-11 (laminin-521) and laminin-15 (laminin-523). | 3911 | NA |
| CD9 molecule | ENSG00000010278 | CD9 | This gene encodes a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Tetraspanins are cell surface glycoproteins with four transmembrane domains that form multimeric complexes with other cell surface proteins. The encoded protein functions in many cellular processes including differentiation, adhesion, and signal transduction, and expression of this gene plays a critical role in the suppression of cancer cell motility and metastasis. | 928 | NA |
| prostaglandin D2 synthase | ENSG00000107317 | PTGDS | The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may be also involved in the regulation of non-rapid eye movement sleep. | 5730 | NA |
| ZFP36 ring finger protein | ENSG00000128016 | ZFP36 | NA | 7538 | NA |
| myelin protein zero | ENSG00000158887 | MPZ | This gene is specifically expressed in Schwann cells of the peripheral nervous system and encodes a type I transmembrane glycoprotein that is a major structural protein of the peripheral myelin sheath. The encoded protein contains a large hydrophobic extracellular domain and a smaller basic intracellular domain, which are essential for the formation and stabilization of the multilamellar structure of the compact myelin. Mutations in this gene are associated with autosomal dominant form of Charcot-Marie-Tooth disease type 1 (CMT1B) and other polyneuropathies, such as Dejerine-Sottas syndrome (DSS) and congenital hypomyelinating neuropathy (CHN). A recent study showed that two isoforms are produced from the same mRNA by use of alternative in-frame translation termination codons via a stop codon readthrough mechanism. | 4359 | NA |
| NA | ENSG00000229732 | AC019349.5 | NA | ENSG00000229732 | NA |
| septin 4 | ENSG00000108387 | SEPT4 | This gene is a member of the septin family of nucleotide binding proteins, originally described in yeast as cell division cycle regulatory proteins. Septins are highly conserved in yeast, Drosophila, and mouse, and appear to regulate cytoskeletal organization. Disruption of septin function disturbs cytokinesis and results in large multinucleate or polyploid cells. This gene is highly expressed in brain and heart. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. One of the isoforms (known as ARTS) is distinct; it is localized to the mitochondria, and has a role in apoptosis and cancer. | 5414 | NA |
| interleukin 1 receptor antagonist | ENSG00000136689 | IL1RN | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | 3557 | NA |
| cystatin A | ENSG00000121552 | CSTA | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins, and kininogens. This gene encodes a stefin that functions as a cysteine protease inhibitor, forming tight complexes with papain and the cathepsins B, H, and L. The protein is one of the precursor proteins of cornified cell envelope in keratinocytes and plays a role in epidermal development and maintenance. Stefins have been proposed as prognostic and diagnostic tools for cancer. | 1475 | NA |
| plakophilin 4 | ENSG00000144283 | PKP4 | Armadillo-like proteins are characterized by a series of armadillo repeats, first defined in the Drosophila ‘armadillo’ gene product, that are typically 42 to 45 amino acids in length. These proteins can be divided into subfamilies based on their number of repeats, their overall sequence similarity, and the dispersion of the repeats throughout their sequences. Members of the p120(ctn)/plakophilin subfamily of Armadillo-like proteins, including CTNND1, CTNND2, PKP1, PKP2, PKP4, and ARVCF. PKP4 may be a component of desmosomal plaque and other adhesion plaques and is thought to be involved in regulating junctional plaque organization and cadherin function. Multiple transcript variants encoding different isoforms have been found for this gene. | 8502 | NA |
| CD63 molecule | ENSG00000135404 | CD63 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. The encoded protein is a cell surface glycoprotein that is known to complex with integrins. It may function as a blood platelet activation marker. Deficiency of this protein is associated with Hermansky-Pudlak syndrome. Also this gene has been associated with tumor progression. Alternative splicing results in multiple transcript variants encoding different protein isoforms. | 967 | NA |
| ankyrin repeat domain 1 | ENSG00000148677 | ANKRD1 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 | NA |
| small proline rich protein 3 | ENSG00000163209 | SPRR3 | NA | 6707 | NA |
| pentraxin 3 | ENSG00000163661 | PTX3 | NA | 5806 | NA |
| endoplasmic reticulum-golgi intermediate compartment 1 | ENSG00000113719 | ERGIC1 | This gene encodes a cycling membrane protein which is an endoplasmic reticulum-golgi intermediate compartment (ERGIC) protein which interacts with other members of this protein family to increase their turnover. | 57222 | NA |
| S100 calcium binding protein A9 | ENSG00000163220 | S100A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | 6280 | NA |
| serpin family B member 9 | ENSG00000170542 | SERPINB9 | This gene encodes a member of the serine protease inhibitor family which are also known as serpins. The encoded protein belongs to a subfamily of intracellular serpins. This protein inhibits the activity of the effector molecule granzyme B. Overexpression of this protein may prevent cytotoxic T-lymphocytes from eliminating certain tumor cells. A pseudogene of this gene is found on chromosome 6. | 5272 | NA |
| obscurin-like 1 | ENSG00000124006 | OBSL1 | Cytoskeletal adaptor proteins function in linking the internal cytoskeleton of cells to the cell membrane. This gene encodes a cytoskeletal adaptor protein, which is a member of the Unc-89/obscurin family. The protein contains multiple N- and C-terminal immunoglobulin (Ig)-like domains and a central fibronectin type 3 domain. Mutations in this gene cause 3M syndrome type 2. Alternatively spliced transcript variants encoding different isoforms have been found in this gene. | 23363 | NA |
| StAR related lipid transfer domain containing 9 | ENSG00000159433 | STARD9 | NA | 57519 | NA |
| XIAP associated factor 1 | ENSG00000132530 | XAF1 | This gene encodes a protein which binds to and counteracts the inhibitory effect of a member of the IAP (inhibitor of apoptosis) protein family. IAP proteins bind to and inhibit caspases which are activated during apoptosis. The proportion of IAPs and proteins which interfere with their activity, such as the encoded protein, affect the progress of the apoptosis signaling pathway. Multiple transcript variants encoding different isoforms have been found for this gene. | 54739 | NA |
| basic helix-loop-helix family member e40 | ENSG00000134107 | BHLHE40 | This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL’s transactivation of PER1. This gene is believed to be involved in the control of circadian rhythm and cell differentiation. | 8553 | NA |
| cystatin B | ENSG00000160213 | CSTB | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and kininogens. This gene encodes a stefin that functions as an intracellular thiol protease inhibitor. The protein is able to form a dimer stabilized by noncovalent forces, inhibiting papain and cathepsins l, h and b. The protein is thought to play a role in protecting against the proteases leaking from lysosomes. Evidence indicates that mutations in this gene are responsible for the primary defects in patients with progressive myoclonic epilepsy (EPM1). | 1476 | NA |
| hypoxia inducible lipid droplet associated | ENSG00000135245 | HILPDA | NA | 29923 | NA |
| thioredoxin reductase 1 | ENSG00000198431 | TXNRD1 | This gene encodes a member of the family of pyridine nucleotide oxidoreductases. This protein reduces thioredoxins as well as other substrates, and plays a role in selenium metabolism and protection against oxidative stress. The functional enzyme is thought to be a homodimer which uses FAD as a cofactor. Each subunit contains a selenocysteine (Sec) residue which is required for catalytic activity. The selenocysteine is encoded by the UGA codon that normally signals translation termination. The 3’ UTR of selenocysteine-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), that is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. Alternative splicing results in several transcript variants encoding the same or different isoforms. | 7296 | NA |
| transferrin | ENSG00000091513 | TF | This gene encodes a glycoprotein with an approximate molecular weight of 76.5 kDa. It is thought to have been created as a result of an ancient gene duplication event that led to generation of homologous C and N-terminal domains each of which binds one ion of ferric iron. The function of this protein is to transport iron from the intestine, reticuloendothelial system, and liver parenchymal cells to all proliferating cells in the body. This protein may also have a physiologic role as granulocyte/pollen-binding protein (GPBP) involved in the removal of certain organic matter and allergens from serum. | 7018 | NA |
| periaxin | ENSG00000105227 | PRX | This gene encodes a protein involved in peripheral nerve myelin upkeep. The encoded protein contains 2 PDZ domains which were named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1 (a mammalian tight junction protein). Two alternatively spliced transcript variants have been described for this gene which encode different protein isoforms and which are targeted differently in the Schwann cell. Mutations in this gene cause Charcot-Marie-Tooth neuropathy, type 4F and Dejerine-Sottas neuropathy. | 57716 | NA |
| semaphorin 4C | ENSG00000168758 | SEMA4C | NA | 54910 | NA |
| IKAROS family zinc finger 2 | ENSG00000030419 | IKZF2 | This gene encodes a member of the Ikaros family of zinc-finger proteins. Three members of this protein family (Ikaros, Aiolos and Helios) are hematopoietic-specific transcription factors involved in the regulation of lymphocyte development. This protein forms homo- or hetero-dimers with other Ikaros family members, and is thought to function predominantly in early hematopoietic development. Multiple transcript variants encoding different isoforms have been found for this gene, but the biological validity of some variants has not been determined. | 22807 | NA |
| WNK lysine deficient protein kinase 1 | ENSG00000060237 | WNK1 | This gene encodes a member of the WNK subfamily of serine/threonine protein kinases. The encoded protein may be a key regulator of blood pressure by controlling the transport of sodium and chloride ions. Mutations in this gene have been associated with pseudohypoaldosteronism type II and hereditary sensory neuropathy type II. Alternatively spliced transcript variants encoding different isoforms have been described but the full-length nature of all of them has yet to be determined. | 65125 | NA |
| spectrin beta, non-erythrocytic 1 | ENSG00000115306 | SPTBN1 | Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein contains an N-terminal actin-binding domain, and 17 spectrin repeats which are involved in dimer formation. Multiple transcript variants encoding different isoforms have been found for this gene. | 6711 | NA |
| high density lipoprotein binding protein | ENSG00000115677 | HDLBP | The protein encoded by this gene binds high density lipoprotein (HDL) and may function to regulate excess cholesterol levels in cells. The encoded protein also binds RNA and can induce heterochromatin formation. | 3069 | NA |
| myelin protein zero like 2 | ENSG00000149573 | MPZL2 | Thymus development depends on a complex series of interactions between thymocytes and the stromal component of the organ. Epithelial V-like antigen (EVA) is expressed in thymus epithelium and strongly downregulated by thymocyte developmental progression. This gene is expressed in the thymus and in several epithelial structures early in embryogenesis. It is highly homologous to the myelin protein zero and, in thymus-derived epithelial cell lines, is poorly soluble in nonionic detergents, strongly suggesting an association to the cytoskeleton. Its capacity to mediate cell adhesion through a homophilic interaction and its selective regulation by T cell maturation might imply the participation of EVA in the earliest phases of thymus organogenesis. The protein bears a characteristic V-type domain and two potential N-glycosylation sites in the extracellular domain; a putative serine phosphorylation site for casein kinase 2 is also present in the cytoplasmic tail. Two transcript variants encoding the same protein have been found for this gene. | 10205 | NA |
| small ArfGAP2 | ENSG00000084070 | SMAP2 | NA | 64744 | NA |
| vimentin | ENSG00000026025 | VIM | This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | NA |
| arginine and glutamate rich 1 | ENSG00000134884 | ARGLU1 | NA | 55082 | NA |
| dystonin | ENSG00000151914 | DST | This gene encodes a member of the plakin protein family of adhesion junction plaque proteins. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene, but the full-length nature of some variants has not been defined. It has been reported that some isoforms are expressed in neural and muscle tissue, anchoring neural intermediate filaments to the actin cytoskeleton, and some isoforms are expressed in epithelial tissue, anchoring keratin-containing intermediate filaments to hemidesmosomes. Consistent with the expression, mice defective for this gene show skin blistering and neurodegeneration. | 667 | NA |
| DYX1C1-CCPG1 readthrough (NMD candidate) | ENSG00000261771 | DYX1C1-CCPG1 | This locus represents naturally occurring read-through transcription between the neighboring dyslexia susceptibility 1 candidate 1 (DYX1C1) and cell cycle progression 1 (CCPG1) genes on chromosome 15. The read-through transcript is a candidate for nonsense-mediated mRNA decay (NMD), and is thus unlikely to produce a protein product. | 100533483 | NA |
| 2’-5’-oligoadenylate synthetase 3 | ENSG00000111331 | OAS3 | This gene encodes an enzyme included in the 2’, 5’ oligoadenylate synthase family. This enzyme is induced by interferons and catalyzes the 2’, 5’ oligomers of adenosine in order to bind and activate RNase L. This enzyme family plays a significant role in the inhibition of cellular protein synthesis and viral infection resistance. | 4940 | NA |
| brain protein I3 | ENSG00000164713 | BRI3 | NA | 25798 | NA |
| CDC like kinase 1 | ENSG00000013441 | CLK1 | This gene encodes a member of the CDC2-like (or LAMMER) family of dual specificity protein kinases. In the nucleus, the encoded protein phosphorylates serine/arginine-rich proteins involved in pre-mRNA processing, releasing them into the nucleoplasm. The choice of splice sites during pre-mRNA processing may be regulated by the concentration of transacting factors, including serine/arginine rich proteins. Therefore, the encoded protein may play an indirect role in governing splice site selection. Multiple transcript variants encoding different isoforms have been found for this gene. | 1195 | NA |
| titin | ENSG00000155657 | TTN | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, a N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | NA |
| colony stimulating factor 3 receptor | ENSG00000119535 | CSF3R | The protein encoded by this gene is the receptor for colony stimulating factor 3, a cytokine that controls the production, differentiation, and function of granulocytes. The encoded protein, which is a member of the family of cytokine receptors, may also function in some cell surface adhesion or recognition processes. Alternatively spliced transcript variants have been described. Mutations in this gene are a cause of Kostmann syndrome, also known as severe congenital neutropenia. | 1441 | NA |
| carboxypeptidase A1 | ENSG00000091704 | CPA1 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | 1357 | NA |
| von Willebrand factor A domain containing 1 | ENSG00000179403 | VWA1 | VWA1 belongs to the von Willebrand factor (VWF; MIM 613160) A (VWFA) domain superfamily of extracellular matrix proteins and appears to play a role in cartilage structure and function (Fitzgerald et al., 2002 [PubMed 12062410]). | 64856 | NA |
| heat shock protein family A (Hsp70) member 1B | ENSG00000204388 | HSPA1B | This intronless gene encodes a 70kDa heat shock protein which is a member of the heat shock protein 70 family. In conjunction with other heat shock proteins, this protein stabilizes existing proteins against aggregation and mediates the folding of newly translated proteins in the cytosol and in organelles. It is also involved in the ubiquitin-proteasome pathway through interaction with the AU-rich element RNA-binding protein 1. The gene is located in the major histocompatibility complex class III region, in a cluster with two closely related genes which encode similar proteins. | 3304 | NA |
| prolyl 4-hydroxylase subunit beta | ENSG00000185624 | P4HB | This gene encodes the beta subunit of prolyl 4-hydroxylase, a highly abundant multifunctional enzyme that belongs to the protein disulfide isomerase family. When present as a tetramer consisting of two alpha and two beta subunits, this enzyme is involved in hydroxylation of prolyl residues in preprocollagen. This enzyme is also a disulfide isomerase containing two thioredoxin domains that catalyze the formation, breakage and rearrangement of disulfide bonds. Other known functions include its ability to act as a chaperone that inhibits aggregation of misfolded proteins in a concentration-dependent manner, its ability to bind thyroid hormone, its role in both the influx and efflux of S-nitrosothiol-bound nitric oxide, and its function as a subunit of the microsomal triglyceride transfer protein complex. | 5034 | NA |
| cytochrome P450 family 17 subfamily A member 1 | ENSG00000148795 | CYP17A1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | 1586 | NA |
| LRRC75A antisense RNA 1 | ENSG00000175061 | LRRC75A-AS1 | NA | 125144 | NA |
| LDL receptor related protein associated protein 1 | ENSG00000163956 | LRPAP1 | This gene encodes a protein that interacts with the low density lipoprotein (LDL) receptor-related protein and facilitates its proper folding and localization by preventing the binding of ligands. Mutations in this gene have been identified in individuals with myopia 23. Alternative splicing results in multiple transcript variants. | 4043 | NA |
| acyl-CoA synthetase long-chain family member 5 | ENSG00000197142 | ACSL5 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. This isozyme is highly expressed in uterus and spleen, and in trace amounts in normal brain, but has markedly increased levels in malignant gliomas. This gene functions in mediating fatty acid-induced glioma cell growth. Three transcript variants encoding two different isoforms have been found for this gene. | 51703 | NA |
| cornulin | ENSG00000143536 | CRNN | This gene encodes a member of the ‘fused gene’ family of proteins, which contain N-terminus EF-hand domains and multiple tandem peptide repeats. The encoded protein contains two EF-hand Ca2+ binding domains in its N-terminus and two glutamine- and threonine-rich 60 amino acid repeats in its C-terminus. This gene, also known as squamous epithelial heat shock protein 53, may play a role in the mucosal/epithelial immune response and epidermal differentiation. | 49860 | NA |
| activating transcription factor 3 | ENSG00000162772 | ATF3 | This gene encodes a member of the mammalian activation transcription factor/cAMP responsive element-binding (CREB) protein family of transcription factors. This gene is induced by a variety of signals, including many of those encountered by cancer cells, and is involved in the complex process of cellular stress response. Multiple transcript variants encoding different isoforms have been found for this gene. It is possible that alternative splicing of this gene may be physiologically important in the regulation of target genes. | 467 | NA |
| QKI, KH domain containing, RNA binding | ENSG00000112531 | QKI | The protein encoded by this gene is an RNA-binding protein that regulates pre-mRNA splicing, export of mRNAs from the nucleus, protein translation, and mRNA stability. The encoded protein is involved in myelinization and oligodendrocyte differentiation and may play a role in schizophrenia. Multiple transcript variants encoding different isoforms have been found for this gene. | 9444 | NA |
| regenerating family member 1 alpha | ENSG00000115386 | REG1A | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | 5967 | NA |
| transforming growth factor beta induced | ENSG00000120708 | TGFBI | This gene encodes an RGD-containing protein that binds to type I, II and IV collagens. The RGD motif is found in many extracellular matrix proteins modulating cell adhesion and serves as a ligand recognition sequence for several integrins. This protein plays a role in cell-collagen interactions and may be involved in endochondral bone formation in cartilage. The protein is induced by transforming growth factor-beta and acts to inhibit cell adhesion. Mutations in this gene are associated with multiple types of corneal dystrophy. | 7045 | NA |
| protease, serine 1 | ENSG00000204983 | PRSS1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | 5644 | NA |
| heterogeneous nuclear ribonucleoprotein A3 | ENSG00000170144 | HNRNPA3 | NA | 220988 | NA |
| 2’-5’-oligoadenylate synthetase 2 | ENSG00000111335 | OAS2 | This gene encodes a member of the 2-5A synthetase family, essential proteins involved in the innate immune response to viral infection. The encoded protein is induced by interferons and uses adenosine triphosphate in 2’-specific nucleotidyl transfer reactions to synthesize 2’,5’-oligoadenylates (2-5As). These molecules activate latent RNase L, which results in viral RNA degradation and the inhibition of viral replication. The three known members of this gene family are located in a cluster on chromosome 12. Alternatively spliced transcript variants encoding different isoforms have been described. | 4939 | NA |
| phosphatidylinositol-5-phosphate 4-kinase type 2 alpha | ENSG00000150867 | PIP4K2A | Phosphatidylinositol-5,4-bisphosphate, the precursor to second messengers of the phosphoinositide signal transduction pathways, is thought to be involved in the regulation of secretion, cell proliferation, differentiation, and motility. The protein encoded by this gene is one of a family of enzymes capable of catalyzing the phosphorylation of phosphatidylinositol-5-phosphate on the fourth hydroxyl of the myo-inositol ring to form phosphatidylinositol-5,4-bisphosphate. The amino acid sequence of this enzyme does not show homology to other kinases, but the recombinant protein does exhibit kinase activity. This gene is a member of the phosphatidylinositol-5-phosphate 4-kinase family. | 5305 | NA |
| selectin P ligand | ENSG00000110876 | SELPLG | This gene encodes a glycoprotein that functions as a high affinity counter-receptor for the cell adhesion molecules P-, E- and L- selectin expressed on myeloid cells and stimulated T lymphocytes. As such, this protein plays a critical role in leukocyte trafficking during inflammation by tethering of leukocytes to activated platelets or endothelia expressing selectins. This protein requires two post-translational modifications, tyrosine sulfation and the addition of the sialyl Lewis x tetrasaccharide (sLex) to its O-linked glycans, for its high-affinity binding activity. Aberrant expression of this gene and polymorphisms in this gene are associated with defects in the innate and adaptive immune response. Alternate splicing results in multiple transcript variants. | 6404 | NA |
| keratin 10 | ENSG00000186395 | KRT10 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | 3858 | NA |
| OS9, endoplasmic reticulum lectin | ENSG00000135506 | OS9 | This gene encodes a protein that is highly expressed in osteosarcomas. This protein binds to the hypoxia-inducible factor 1 (HIF-1), a key regulator of the hypoxic response and angiogenesis, and promotes the degradation of one of its subunits. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 10956 | NA |
| lysophosphatidic acid receptor 6 | ENSG00000139679 | LPAR6 | The protein encoded by this gene belongs to the family of G-protein coupled receptors, that are preferentially activated by adenosine and uridine nucleotides. This gene aligns with an internal intron of the retinoblastoma susceptibility gene in the reverse orientation. Alternative splicing results in multiple transcript variants. | 10161 | NA |
| lipase F, gastric type | ENSG00000182333 | LIPF | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | 8513 | NA |
| TAR DNA binding protein | ENSG00000120948 | TARDBP | HIV-1, the causative agent of acquired immunodeficiency syndrome (AIDS), contains an RNA genome that produces a chromosomally integrated DNA during the replicative cycle. Activation of HIV-1 gene expression by the transactivator Tat is dependent on an RNA regulatory element (TAR) located downstream of the transcription initiation site. The protein encoded by this gene is a transcriptional repressor that binds to chromosomally integrated TAR DNA and represses HIV-1 transcription. In addition, this protein regulates alternate splicing of the CFTR gene. A similar pseudogene is present on chromosome 20. | 23435 | NA |
| adipocyte plasma membrane associated protein | ENSG00000101474 | APMAP | NA | 57136 | NA |
| phosphatidylinositol-4-phosphate 3-kinase catalytic subunit type 2 beta | ENSG00000133056 | PIK3C2B | The protein encoded by this gene belongs to the phosphoinositide 3-kinase (PI3K) family. PI3-kinases play roles in signaling pathways involved in cell proliferation, oncogenic transformation, cell survival, cell migration, and intracellular protein trafficking. This protein contains a lipid kinase catalytic domain as well as a C-terminal C2 domain, a characteristic of class II PI3-kinases. C2 domains act as calcium-dependent phospholipid binding motifs that mediate translocation of proteins to membranes, and may also mediate protein-protein interactions. The PI3-kinase activity of this protein is sensitive to low nanomolar levels of the inhibitor wortmannin. The C2 domain of this protein was shown to bind phospholipids but not Ca2+, which suggests that this enzyme may function in a calcium-independent manner. | 5287 | NA |
| myeloid cell nuclear differentiation antigen | ENSG00000163563 | MNDA | The myeloid cell nuclear differentiation antigen (MNDA) is detected only in nuclei of cells of the granulocyte-monocyte lineage. A 200-amino acid region of human MNDA is strikingly similar to a region in the proteins encoded by a family of interferon-inducible mouse genes, designated Ifi-201, Ifi-202, and Ifi-203, that are not regulated in a cell- or tissue-specific fashion. The 1.8-kb MNDA mRNA, which contains an interferon-stimulated response element in the 5-prime untranslated region, was significantly upregulated in human monocytes exposed to interferon alpha. MNDA is located within 2,200 kb of FCER1A, APCS, CRP, and SPTA1. In its pattern of expression and/or regulation, MNDA resembles IFI16, suggesting that these genes participate in blood cell-specific responses to interferons. | 4332 | NA |
| syndecan 1 | ENSG00000115884 | SDC1 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-1 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-1 expression has been detected in several different tumor types. While several transcript variants may exist for this gene, the full-length natures of only two have been described to date. These two represent the major variants of this gene and encode the same protein. | 6382 | NA |
| chymotrypsin like elastase family member 3A | ENSG00000142789 | CELA3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | 10136 | NA |
| ornithine decarboxylase antizyme 1 | ENSG00000104904 | OAZ1 | The protein encoded by this gene belongs to the ornithine decarboxylase antizyme family, which plays a role in cell growth and proliferation by regulating intracellular polyamine levels. Expression of antizymes requires +1 ribosomal frameshifting, which is enhanced by high levels of polyamines. Antizymes in turn bind to and inhibit ornithine decarboxylase (ODC), the key enzyme in polyamine biosynthesis; thus, completing the auto-regulatory circuit. This gene encodes antizyme 1, the first member of the antizyme family, that has broad tissue distribution, and negatively regulates intracellular polyamine levels by binding to and targeting ODC for degradation, as well as inhibiting polyamine uptake. Antizyme 1 mRNA contains two potential in-frame AUGs; and studies in rat suggest that alternative use of the two translation initiation sites results in N-terminally distinct protein isoforms with different subcellular localization. Alternatively spliced transcript variants have also been noted for this gene. | 4946 | NA |
| misshapen like kinase 1 | ENSG00000141503 | MINK1 | This gene encodes a serine/threonine kinase belonging to the germinal center kinase (GCK) family. The protein is structurally similar to the kinases that are related to NIK and may belong to a distinct subfamily of NIK-related kinases within the GCK family. Studies of the mouse homolog indicate an up-regulation of expression in the course of postnatal mouse cerebral development and activation of the cJun N-terminal kinase (JNK) and the p38 pathways. | 50488 | NA |
| AHNAK nucleoprotein | ENSG00000124942 | AHNAK | NA | 79026 | NA |
| Rap guanine nucleotide exchange factor 5 | ENSG00000136237 | RAPGEF5 | Members of the RAS (see HRAS; MIM 190020) subfamily of GTPases function in signal transduction as GTP/GDP-regulated switches that cycle between inactive GDP- and active GTP-bound states. Guanine nucleotide exchange factors (GEFs), such as RAPGEF5, serve as RAS activators by promoting acquisition of GTP to maintain the active GTP-bound state and are the key link between cell surface receptors and RAS activation (Rebhun et al., 2000 [PubMed 10934204]). | 9771 | NA |
| intercellular adhesion molecule 3 | ENSG00000076662 | ICAM3 | The protein encoded by this gene is a member of the intercellular adhesion molecule (ICAM) family. All ICAM proteins are type I transmembrane glycoproteins, contain 2-9 immunoglobulin-like C2-type domains, and bind to the leukocyte adhesion LFA-1 protein. This protein is constitutively and abundantly expressed by all leucocytes and may be the most important ligand for LFA-1 in the initiation of the immune response. It functions not only as an adhesion molecule, but also as a potent signalling molecule. Alternative splicing results in multiple transcript variants encoding different isoforms. | 3385 | NA |
| baculoviral IAP repeat containing 3 | ENSG00000023445 | BIRC3 | This gene encodes a member of the IAP family of proteins that inhibit apoptosis by binding to tumor necrosis factor receptor-associated factors TRAF1 and TRAF2, probably by interfering with activation of ICE-like proteases. The encoded protein inhibits apoptosis induced by serum deprivation but does not affect apoptosis resulting from exposure to menadione, a potent inducer of free radicals. It contains 3 baculovirus IAP repeats and a ring finger domain. Transcript variants encoding the same isoform have been identified. | 330 | NA |
| peripheral myelin protein 22 | ENSG00000109099 | PMP22 | This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | 5376 | NA |
| NA | ENSG00000140181 | NA | NA | NA | TRUE |
| albumin | ENSG00000163631 | ALB | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | 213 | NA |
| NA | ENSG00000259716 | NA | NA | NA | TRUE |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",17,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
out <- mygene::queryMany(gene_list[18,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | query | symbol | name | notfound |
|---|---|---|---|---|---|
| The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | 3043 | ENSG00000244734 | HBB | hemoglobin subunit beta | NA |
| The protein encoded by this gene belongs to the glutamine synthetase family. It catalyzes the synthesis of glutamine from glutamate and ammonia in an ATP-dependent reaction. This protein plays a role in ammonia and glutamate detoxification, acid-base homeostasis, cell signaling, and cell proliferation. Glutamine is an abundant amino acid, and is important to the biosynthesis of several amino acids, pyrimidines, and purines. Mutations in this gene are associated with congenital glutamine deficiency, and overexpression of this gene was observed in some primary liver cancer samples. There are six pseudogenes of this gene found on chromosomes 2, 5, 9, 11, and 12. Alternative splicing results in multiple transcript variants. | 2752 | ENSG00000135821 | GLUL | glutamate-ammonia ligase | NA |
| This gene product belongs to the glutathione peroxidase family, which functions in the detoxification of hydrogen peroxide. It contains a selenocysteine (Sec) residue at its active site. The selenocysteine is encoded by the UGA codon, which normally signals translation termination. The 3’ UTR of Sec-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), which is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. | 2878 | ENSG00000211445 | GPX3 | glutathione peroxidase 3 | NA |
| The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3040 | ENSG00000188536 | HBA2 | hemoglobin subunit alpha 2 | NA |
| This gene encodes the anterior pituitary hormone prolactin. This secreted hormone is a growth regulator for many tissues, including cells of the immune system. It may also play a role in cell survival by suppressing apoptosis, and it is essential for lactation. Alternative splicing results in multiple transcript variants that encode the same protein. | 5617 | ENSG00000172179 | PRL | prolactin | NA |
| NA | 79026 | ENSG00000124942 | AHNAK | AHNAK nucleoprotein | NA |
| LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | 4023 | ENSG00000175445 | LPL | lipoprotein lipase | NA |
| The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | 4053 | ENSG00000119681 | LTBP2 | latent transforming growth factor beta binding protein 2 | NA |
| The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | 2194 | ENSG00000169710 | FASN | fatty acid synthase | NA |
| This gene encodes a glycoprotein involved in hemostasis. The encoded preproprotein is proteolytically processed following assembly into large multimeric complexes. These complexes function in the adhesion of platelets to sites of vascular injury and the transport of various proteins in the blood. Mutations in this gene result in von Willebrand disease, an inherited bleeding disorder. An unprocessed pseudogene has been found on chromosome 22. | 7450 | ENSG00000110799 | VWF | von Willebrand factor | NA |
| The protein encoded by this gene is a member of the somatotropin/prolactin family of hormones which play an important role in growth control. The gene, along with four other related genes, is located at the growth hormone locus on chromosome 17 where they are interspersed in the same transcriptional orientation; an arrangement which is thought to have evolved by a series of gene duplications. The five genes share a remarkably high degree of sequence identity. Alternative splicing generates additional isoforms of each of the five growth hormones, leading to further diversity and potential for specialization. This particular family member is expressed in the pituitary but not in placental tissue as is the case for the other four genes in the growth hormone locus. Mutations in or deletions of the gene lead to growth hormone deficiency and short stature. | 2688 | ENSG00000259384 | GH1 | growth hormone 1 | NA |
| Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | 32 | ENSG00000076555 | ACACB | acetyl-CoA carboxylase beta | NA |
| NA | NA | ENSG00000117289 | NA | NA | TRUE |
| Members of the perilipin family, such as PLIN4, coat intracellular lipid storage droplets (Wolins et al., 2003 [PubMed 12840023]). | 729359 | ENSG00000167676 | PLIN4 | perilipin 4 | NA |
| This locus has a highly complex imprinted expression pattern. It gives rise to maternally, paternally, and biallelically expressed transcripts that are derived from four alternative promoters and 5’ exons. Some transcripts contain a differentially methylated region (DMR) at their 5’ exons, and this DMR is commonly found in imprinted genes and correlates with transcript expression. An antisense transcript is produced from an overlapping locus on the opposite strand. One of the transcripts produced from this locus, and the antisense transcript, are paternally expressed noncoding RNAs, and may regulate imprinting in this region. In addition, one of the transcripts contains a second overlapping ORF, which encodes a structurally unrelated protein - Alex. Alternative splicing of downstream exons is also observed, which results in different forms of the stimulatory G-protein alpha subunit, a key element of the classical signal transduction pathway linking receptor-ligand interactions with the activation of adenylyl cyclase and a variety of cellular responses. Multiple transcript variants encoding different isoforms have been found for this gene. Mutations in this gene result in pseudohypoparathyroidism type 1a, pseudohypoparathyroidism type 1b, Albright hereditary osteodystrophy, pseudopseudohypoparathyroidism, McCune-Albright syndrome, progressive osseous heteroplasia, polyostotic fibrous dysplasia of bone, and some pituitary tumors. | 2778 | ENSG00000087460 | GNAS | GNAS complex locus | NA |
| The protein encoded by this gene is a mechanically-activated ion channel that links mechanical forces to biological signals. The encoded protein contains 36 transmembrane domains and functions as a homotetramer. Defects in this gene have been associated with dehydrated hereditary stomatocytosis. | 9780 | ENSG00000103335 | PIEZO1 | piezo type mechanosensitive ion channel component 1 | NA |
| Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 1. The beta 1 chain has 7 structurally distinct domains which it shares with other beta chain isomers. The C-terminal helical region containing domains I and II are separated by domain alpha, domains III and V contain several EGF-like repeats, and domains IV and VI have a globular conformation. Laminin, beta 1 is expressed in most tissues that produce basement membranes, and is one of the 3 chains constituting laminin 1, the first laminin isolated from Engelbreth-Holm-Swarm (EHS) tumor. A sequence in the beta 1 chain that is involved in cell attachment, chemotaxis, and binding to the laminin receptor was identified and shown to have the capacity to inhibit metastasis. | 3912 | ENSG00000091136 | LAMB1 | laminin subunit beta 1 | NA |
| This gene encodes a preproprotein that undergoes extensive, tissue-specific, post-translational processing via cleavage by subtilisin-like enzymes known as prohormone convertases. There are eight potential cleavage sites within the preproprotein and, depending on tissue type and the available convertases, processing may yield as many as ten biologically active peptides involved in diverse cellular functions. The encoded protein is synthesized mainly in corticotroph cells of the anterior pituitary where four cleavage sites are used; adrenocorticotrophin, essential for normal steroidogenesis and the maintenance of normal adrenal weight, and lipotropin beta are the major end products. In other tissues, including the hypothalamus, placenta, and epithelium, all cleavage sites may be used, giving rise to peptides with roles in pain and energy homeostasis, melanocyte stimulation, and immune modulation. These include several distinct melanotropins, lipotropins, and endorphins that are contained within the adrenocorticotrophin and beta-lipotropin peptides. The antimicrobial melanotropin alpha peptide exhibits antibacterial and antifungal activity. Mutations in this gene have been associated with early onset obesity, adrenal insufficiency, and red hair pigmentation. Alternatively spliced transcript variants encoding the same protein have been described. | 5443 | ENSG00000115138 | POMC | proopiomelanocortin | NA |
| NA | ENSG00000251322 | ENSG00000251322 | SHANK3 | SH3 and multiple ankyrin repeat domains 3 | NA |
| The protein encoded by this gene is a member of the scavenger receptor cysteine-rich (SRCR) superfamily, and is exclusively expressed in monocytes and macrophages. It functions as an acute phase-regulated receptor involved in the clearance and endocytosis of hemoglobin/haptoglobin complexes by macrophages, and may thereby protect tissues from free hemoglobin-mediated oxidative damage. This protein may also function as an innate immune sensor for bacteria and inducer of local inflammation. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | 9332 | ENSG00000177575 | CD163 | CD163 molecule | NA |
| FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that the roles of FABPs include fatty acid uptake, transport, and metabolism. | 2167 | ENSG00000170323 | FABP4 | fatty acid binding protein 4 | NA |
| The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3039 | ENSG00000206172 | HBA1 | hemoglobin subunit alpha 1 | NA |
| This gene encodes a member of the nidogen family of basement membrane glycoproteins. The protein interacts with several other components of basement membranes, and may play a role in cell interactions with the extracellular matrix. | 4811 | ENSG00000116962 | NID1 | nidogen 1 | NA |
| This locus may represent a breast cancer candidate gene. It is located close to FGFR1 on a region of chromosome 8 that is amplified in some breast cancers. Three transcript variants encoding different isoforms have been found for this gene. | 6867 | ENSG00000147526 | TACC1 | transforming acidic coiled-coil containing protein 1 | NA |
| This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix (ECM). Expression of this gene is induced in response to mitogenic stimulation and this netrin domain-containing protein is localized to the ECM. Mutations in this gene have been associated with the autosomal dominant disorder Sorsby’s fundus dystrophy. | 7078 | ENSG00000100234 | TIMP3 | TIMP metallopeptidase inhibitor 3 | NA |
| NA | ENSG00000225630 | ENSG00000225630 | MTND2P28 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 2 pseudogene 28 | NA |
| This gene encodes a member of the insulin-like growth factor (IGF)-binding protein (IGFBP) family. IGFBPs bind IGFs with high affinity, and regulate IGF availability in body fluids and tissues and modulate IGF binding to its receptors. This protein binds IGF-I and IGF-II with relatively low affinity, and belongs to a subfamily of low-affinity IGFBPs. It also stimulates prostacyclin production and cell adhesion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene, and one variant has been associated with retinal arterial macroaneurysm (PMID:21835307). | 3490 | ENSG00000163453 | IGFBP7 | insulin like growth factor binding protein 7 | NA |
| This gene encodes one of the two alpha chains of type VIII collagen. The gene product is a short chain collagen and a major component of the basement membrane of the corneal endothelium. The type VIII collagen fibril can be either a homo- or a heterotrimer. Alternatively spliced transcript variants encoding the same protein have been observed. | 1295 | ENSG00000144810 | COL8A1 | collagen type VIII alpha 1 | NA |
| This gene encodes a large, transmembrane receptor protein which may function in angiogenesis, lymphocyte homing, cell adhesion, or receptor scavenging. The protein contains 7 fasciclin, 16 epidermal growth factor (EGF)-like, and 2 laminin-type EGF-like domains as well as a C-type lectin-like hyaluronan-binding Link module. The protein is primarily expressed on sinusoidal endothelial cells of liver, spleen, and lymph node. The receptor has been shown to endocytose ligands such as low density lipoprotein, Gram-positive and Gram-negative bacteria, and advanced glycosylation end products. Supporting its possible role as a scavenger receptor, the protein rapidly cycles between the plasma membrane and early endosomes. | 23166 | ENSG00000010327 | STAB1 | stabilin 1 | NA |
| This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. Deficiency of C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N terminus and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the B-chain polypeptide of human complement subcomponent C1q. | 713 | ENSG00000173369 | C1QB | complement component 1, q subcomponent, B chain | NA |
| This gene encodes the beta subunit of prolyl 4-hydroxylase, a highly abundant multifunctional enzyme that belongs to the protein disulfide isomerase family. When present as a tetramer consisting of two alpha and two beta subunits, this enzyme is involved in hydroxylation of prolyl residues in preprocollagen. This enzyme is also a disulfide isomerase containing two thioredoxin domains that catalyze the formation, breakage and rearrangement of disulfide bonds. Other known functions include its ability to act as a chaperone that inhibits aggregation of misfolded proteins in a concentration-dependent manner, its ability to bind thyroid hormone, its role in both the influx and efflux of S-nitrosothiol-bound nitric oxide, and its function as a subunit of the microsomal triglyceride transfer protein complex. | 5034 | ENSG00000185624 | P4HB | prolyl 4-hydroxylase subunit beta | NA |
| This gene encodes a member of the fibulin family of extracellular matrix glycoproteins. Like all members of this family, the encoded protein contains tandemly repeated epidermal growth factor-like repeats followed by a C-terminus fibulin-type domain. This gene is upregulated in malignant gliomas and may play a role in the aggressive nature of these tumors. Mutations in this gene are associated with Doyne honeycomb retinal dystrophy. Alternatively spliced transcript variants that encode the same protein have been described. | 2202 | ENSG00000115380 | EFEMP1 | EGF containing fibulin like extracellular matrix protein 1 | NA |
| This gene encodes a tyrosine-sulfated secretory protein abundant in peptidergic endocrine cells and neurons. This protein may serve as a precursor for regulatory peptides. | 1114 | ENSG00000089199 | CHGB | chromogranin B | NA |
| Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | 10398 | ENSG00000101335 | MYL9 | myosin light chain 9 | NA |
| The protein encoded by this gene is the fourth major glycoprotein of the platelet surface and serves as a receptor for thrombospondin in platelets and various cell lines. Since thrombospondins are widely distributed proteins involved in a variety of adhesive processes, this protein may have important functions as a cell adhesion molecule. It binds to collagen, thrombospondin, anionic phospholipids and oxidized LDL. It directly mediates cytoadherence of Plasmodium falciparum parasitized erythrocytes and it binds long chain fatty acids and may function in the transport and/or as a regulator of fatty acid transport. Mutations in this gene cause platelet glycoprotein deficiency. Multiple alternatively spliced transcript variants have been found for this gene. | 948 | ENSG00000135218 | CD36 | CD36 molecule | NA |
| The protein encoded by this gene associates with class II major histocompatibility complex (MHC) and is an important chaperone that regulates antigen presentation for immune response. It also serves as cell surface receptor for the cytokine macrophage migration inhibitory factor (MIF) which, when bound to the encoded protein, initiates survival pathways and cell proliferation. This protein also interacts with amyloid precursor protein (APP) and suppresses the production of amyloid beta (Abeta). Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | 972 | ENSG00000019582 | CD74 | CD74 molecule | NA |
| This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets. Mutations in this gene have been found in patients with velocardiofacial syndrome. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | 7122 | ENSG00000184113 | CLDN5 | claudin 5 | NA |
| Spectrin is an actin crosslinking and molecular scaffold protein that links the plasma membrane to the actin cytoskeleton, and functions in the determination of cell shape, arrangement of transmembrane proteins, and organization of organelles. It is composed of two antiparallel dimers of alpha- and beta- subunits. This gene is one member of a family of beta-spectrin genes. The encoded protein contains an N-terminal actin-binding domain, and 17 spectrin repeats which are involved in dimer formation. Multiple transcript variants encoding different isoforms have been found for this gene. | 6711 | ENSG00000115306 | SPTBN1 | spectrin beta, non-erythrocytic 1 | NA |
| This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | 5284 | ENSG00000162896 | PIGR | polymeric immunoglobulin receptor | NA |
| The protein encoded by this gene is a member of the phospholipase A2 family (PLA2). PLA2s constitute a diverse family of enzymes with respect to sequence, function, localization, and divalent cation requirements. This gene product belongs to group II, which contains secreted form of PLA2, an extracellular enzyme that has a low molecular mass and requires calcium ions for catalysis. It catalyzes the hydrolysis of the sn-2 fatty acid acyl ester bond of phosphoglycerides, releasing free fatty acids and lysophospholipids, and thought to participate in the regulation of the phospholipid metabolism in biomembranes. Several alternatively spliced transcript variants with different 5’ UTRs have been found for this gene. | 5320 | ENSG00000188257 | PLA2G2A | phospholipase A2 group IIA | NA |
| Amyloid precursor proteins are processed by beta-secretase and gamma-secretase to produce beta-amyloid peptides which form the characteristic plaques of Alzheimer disease. This gene encodes a transmembrane protein which is processed at the C-terminus by furin or furin-like proteases to produce a small secreted peptide which inhibits the deposition of beta-amyloid. Mutations which result in extension of the C-terminal end of the encoded protein, thereby increasing the size of the secreted peptide, are associated with two neurodegenerative diseases, familial British dementia and familial Danish dementia. | 9445 | ENSG00000136156 | ITM2B | integral membrane protein 2B | NA |
| The protein encoded by this gene is a glycosylated membrane protein and a non-specific receptor for several chemokines. The encoded protein is the receptor for the human malarial parasites Plasmodium vivax and Plasmodium knowlesi. Polymorphisms in this gene are the basis of the Duffy blood group system. Two transcript variants encoding different isoforms have been found for this gene. | 2532 | ENSG00000213088 | ACKR1 | atypical chemokine receptor 1 (Duffy blood group) | NA |
| This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. A deficiency in C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N-terminus, and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the C-chain polypeptide of human complement subcomponent C1q. Alternatively spliced transcript variants that encode the same protein have been found for this gene. | 714 | ENSG00000159189 | C1QC | complement component 1, q subcomponent, C chain | NA |
| This gene is a member of the aggrecan/versican proteoglycan family. The protein encoded is a large chondroitin sulfate proteoglycan and is a major component of the extracellular matrix. This protein is involved in cell adhesion, proliferation, migration and angiogenesis and plays a central role in tissue morphogenesis and maintenance. Mutations in this gene are the cause of Wagner syndrome type 1. Multiple transcript variants encoding different isoforms have been found for this gene. | 1462 | ENSG00000038427 | VCAN | versican | NA |
| Fibromodulin belongs to the family of small interstitial proteoglycans. The encoded protein possesses a central region containing leucine-rich repeats with 4 keratan sulfate chains, flanked by terminal domains containing disulphide bonds. Owing to the interaction with type I and type II collagen fibrils and in vitro inhibition of fibrillogenesis, the encoded protein may play a role in the assembly of extracellular matrix. It may also regulate TGF-beta activities by sequestering TGF-beta into the extracellular matrix. Sequence variations in this gene may be associated with the pathogenesis of high myopia. Alternative splicing results in multiple transcript variants. | 2331 | ENSG00000122176 | FMOD | fibromodulin | NA |
| The protein encoded by this gene belongs to the family of latent TGF-beta binding proteins (LTBPs). The secretion and activation of TGF-betas is regulated by their association with latency-associated proteins and with latent TGF-beta binding proteins. The product of this gene targets latent complexes of transforming growth factor beta to the extracellular matrix, where the latent cytokine is subsequently activated by several different mechanisms. Alternatively spliced transcript variants encoding different isoforms have been identified. | 4052 | ENSG00000049323 | LTBP1 | latent transforming growth factor beta binding protein 1 | NA |
| This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | 8490 | ENSG00000143248 | RGS5 | regulator of G-protein signaling 5 | NA |
| NA | 4642 | ENSG00000176658 | MYO1D | myosin ID | NA |
| NA | ENSG00000211890 | ENSG00000211890 | IGHA2 | immunoglobulin heavy constant alpha 2 (A2m marker) | NA |
| This gene encodes a member of the unconventional myosin protein family, which are actin-based molecular motors. The protein is found in the cytoplasm, and one isoform with a unique N-terminus is also found in the nucleus. The nuclear isoform associates with RNA polymerase I and II and functions in transcription initiation. The mouse ortholog of this protein also functions in intracellular vesicle transport to the plasma membrane. Multiple transcript variants encoding different isoforms have been found for this gene. The related gene myosin IE has been referred to as myosin IC in the literature, but it is a distinct locus on chromosome 19. | 4641 | ENSG00000197879 | MYO1C | myosin IC | NA |
| The protein encoded by this gene regulates inositol phosphate metabolism by phosphorylation of second messenger inositol 1,4,5-trisphosphate to Ins(1,3,4,5)P4. The activity of this encoded protein is responsible for regulating the levels of a large number of inositol polyphosphates that are important in cellular signaling. Both calcium/calmodulin and protein phosphorylation mechanisms control its activity. | 3707 | ENSG00000143772 | ITPKB | inositol-trisphosphate 3-kinase B | NA |
| This gene encodes a member of the M14 family of metallocarboxypeptidases. The encoded preproprotein is proteolytically processed to generate the mature peptidase. This peripheral membrane protein cleaves C-terminal amino acid residues and is involved in the biosynthesis of peptide hormones and neurotransmitters, including insulin. This protein may also function independently of its peptidase activity, as a neurotrophic factor that promotes neuronal survival, and as a sorting receptor that binds to regulated secretory pathway proteins, including prohormones. Mutations in this gene are implicated in type 2 diabetes. | 1363 | ENSG00000109472 | CPE | carboxypeptidase E | NA |
| The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | 5346 | ENSG00000166819 | PLIN1 | perilipin 1 | NA |
| This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. Deficiency of C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N terminus and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the A-chain polypeptide of human complement subcomponent C1q. | 712 | ENSG00000173372 | C1QA | complement component 1, q subcomponent, A chain | NA |
| This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | 4628 | ENSG00000133026 | MYH10 | myosin, heavy chain 10, non-muscle | NA |
| This gene encodes a member of carboxypeptidase A protein family. The encoded protein may function as a transcriptional repressor and play a role in adipogenesis and smooth muscle cell differentiation. Studies in mice suggest that this gene functions in wound healing and abdominal wall development. Overexpression of this gene is associated with glioblastoma. | 165 | ENSG00000106624 | AEBP1 | AE binding protein 1 | NA |
| This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | 5376 | ENSG00000109099 | PMP22 | peripheral myelin protein 22 | NA |
| The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | 3991 | ENSG00000079435 | LIPE | lipase E, hormone sensitive type | NA |
| Major alterations in the composition of the cartilage extracellular matrix occur in joint disease, such as osteoarthrosis. This gene encodes the cartilage intermediate layer protein (CILP), which increases in early osteoarthrosis cartilage. The gene was thought to encode a precursor for two different proteins, an N-terminal CILP and a C-terminal homolog of NTPPHase; however, later studies identified no nucleotide pyrophosphatase phosphodiesterase (NPP) activity. The full-length protein and its N-terminal domain were shown to function as an IGF-1 antagonist. An allelic variant of this gene has been associated with lumbar disc disease. | 8483 | ENSG00000138615 | CILP | cartilage intermediate layer protein | NA |
| The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | 4856 | ENSG00000136999 | NOV | nephroblastoma overexpressed | NA |
| NA | 27254 | ENSG00000172346 | CSDC2 | cold shock domain containing C2 | NA |
| This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | 4240 | ENSG00000140545 | MFGE8 | milk fat globule-EGF factor 8 protein | NA |
| This gene encodes a large protein that contains six ankyrin repeats, as well as a Src homology 3 (SH3) domain and two sterile alpha motif (SAM) domains, which may be involved in protein-protein interactions. The C-terminal portion of this protein is proline-rich and contains a conserved region. A related protein interacts with calcium/calmodulin-dependent serine protein kinase (CASK). Alternative splicing results in multiple transcript variants. | 57513 | ENSG00000177303 | CASKIN2 | CASK interacting protein 2 | NA |
| This gene encodes a type I cell-surface receptor for the TGF-beta superfamily of ligands. It shares with other type I receptors a high degree of similarity in serine-threonine kinase subdomains, a glycine- and serine-rich region (called the GS domain) preceding the kinase domain, and a short C-terminal tail. The encoded protein, sometimes termed ALK1, shares similar domain structures with other closely related ALK or activin receptor-like kinase proteins that form a subfamily of receptor serine/threonine kinases. Mutations in this gene are associated with hemorrhagic telangiectasia type 2, also known as Rendu-Osler-Weber syndrome 2. | 94 | ENSG00000139567 | ACVRL1 | activin A receptor like type 1 | NA |
| This gene encodes an alpha chain for one of the low abundance fibrillar collagens. Fibrillar collagen molecules are trimers that can be composed of one or more types of alpha chains. Type V collagen is found in tissues containing type I collagen and appears to regulate the assembly of heterotypic fibers composed of both type I and type V collagen. This gene product is closely related to type XI collagen and it is possible that the collagen chains of types V and XI constitute a single collagen type with tissue-specific chain combinations. Mutations in this gene are thought to be responsible for the symptoms of a subset of patients with Ehlers-Danlos syndrome type III. Messages of several sizes can be detected in northern blots but sequence information cannot confirm the identity of the shorter messages. | 50509 | ENSG00000080573 | COL5A3 | collagen type V alpha 3 | NA |
| This gene encodes a secreted endothelial cell protein that contains two epidermal growth factor-like domains. The encoded protein may play a role in regulating vasculogenesis. This protein may be involved in the growth and proliferation of tumor cells. Alternate splicing results in multiple transcript variants. | 51162 | ENSG00000172889 | EGFL7 | EGF like domain multiple 7 | NA |
| This gene may play a role in regulation of the innate immune response. The encoded protein is upregulated in response to viral infection and may be involved in mediation of tumor necrosis factor-alpha proinflammatory responses. Mutations in this gene have been associated with Aicardi-Goutieres syndrome. | 25939 | ENSG00000101347 | SAMHD1 | SAM and HD domain containing deoxynucleoside triphosphate triphosphohydrolase 1 | NA |
| This gene encodes an alpha integrin. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain. This protein contains an I domain, is expressed in muscle tissue, dimerizes with beta 1 integrin in vitro, and appears to bind collagen in this form. Therefore, the protein may be involved in attaching muscle tissue to the extracellular matrix. Alternative transcriptional splice variants have been found for this gene, but their biological validity is not determined. | 22801 | ENSG00000137809 | ITGA11 | integrin subunit alpha 11 | NA |
| The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. The use of alternate polyadenylation sites has been found for this gene. | 23555 | ENSG00000099282 | TSPAN15 | tetraspanin 15 | NA |
| This gene is a member of the apolipoprotein L gene family, and it is present in a cluster with other family members on chromosome 22. The encoded protein is found in the cytoplasm, where it may affect the movement of lipids, including cholesterol, and/or allow the binding of lipids to organelles. In addition, expression of this gene is up-regulated by tumor necrosis factor-alpha in endothelial cells lining the normal and atherosclerotic iliac artery and aorta. Alternative splicing results in multiple transcript variants. | 80833 | ENSG00000128284 | APOL3 | apolipoprotein L3 | NA |
| Tryptases comprise a family of trypsin-like serine proteases, the peptidase family S1. Tryptases are enzymatically active only as heparin-stabilized tetramers, and they are resistant to all known endogenous proteinase inhibitors. Several tryptase genes are clustered on chromosome 16p13.3. These genes are characterized by several distinct features. They have a highly conserved 3’ UTR and contain tandem repeat sequences at the 5’ flank and 3’ UTR which are thought to play a role in regulation of the mRNA stability. These genes have an intron immediately upstream of the initiator Met codon, which separates the site of transcription initiation from protein coding sequence. This feature is characteristic of tryptases but is unusual in other genes. The alleles of this gene exhibit an unusual amount of sequence variation, such that the alleles were once thought to represent two separate genes, alpha and beta 1. Beta tryptases appear to be the main isoenzymes expressed in mast cells; whereas in basophils, alpha tryptases predominate. Tryptases have been implicated as mediators in the pathogenesis of asthma and other allergic and inflammatory disorders. | 7177 | ENSG00000172236 | TPSAB1 | tryptase alpha/beta 1 | NA |
| This gene encodes a member of the semicarbazide-sensitive amine oxidase family. Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes in the presence of copper and quinone cofactor. The encoded protein is localized to the cell surface, has adhesive properties as well as monoamine oxidase activity, and may be involved in leukocyte trafficking. Alterations in levels of the encoded protein may be associated with many diseases, including diabetes mellitus. A pseudogene of this gene has been described and is located approximately 9-kb downstream on the same chromosome. Alternative splicing results in multiple transcript variants. | 8639 | ENSG00000131471 | AOC3 | amine oxidase, copper containing 3 | NA |
| This gene encodes an enzyme involved in fatty acid biosynthesis, primarily the synthesis of oleic acid. The protein belongs to the fatty acid desaturase family and is an integral membrane protein located in the endoplasmic reticulum. Transcripts of approximately 3.9 and 5.2 kb, differing only by alternative polyadenylation signals, have been detected. A gene encoding a similar enzyme is located on chromosome 4 and a pseudogene of this gene is located on chromosome 17. | 6319 | ENSG00000099194 | SCD | stearoyl-CoA desaturase | NA |
| The protein encoded by this gene belongs to the thrombospondin protein family. Thrombospondin family members are adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions. This protein forms a pentamer and can bind to heparin and calcium. It is involved in local signaling in the developing and adult nervous system, and it contributes to spinal sensitization and neuropathic pain states. This gene is activated during the stromal response to invasive breast cancer. It may also play a role in inflammatory responses in Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | 7060 | ENSG00000113296 | THBS4 | thrombospondin 4 | NA |
| NA | NA | ENSG00000259716 | NA | NA | TRUE |
| The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This PTP contains an extracellular domain, a single transmembrane segment and one intracytoplasmic catalytic domain, and thus belongs to the receptor-type PTPs. The extracellular region of this PTP is composed of multiple fibronectin type III repeats, which were shown to interact with neuronal receptor and cell adhesion molecules, such as contactin and tenascin C. This protein was also found to interact with sodium channels, and thus may regulate sodium channels by altering tyrosine phosphorylation status. The functions of the interaction partners of this protein implicate this PTP in cell adhesion, neurite growth, and neuronal differentiation. Alternate transcript variants encoding different isoforms have been found for this gene. | 5787 | ENSG00000127329 | PTPRB | protein tyrosine phosphatase, receptor type B | NA |
| This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | 63924 | ENSG00000187288 | CIDEC | cell death inducing DFFA like effector c | NA |
| Phosphatidylinositol 3-kinase phosphorylates the inositol ring of phosphatidylinositol at the 3-prime position. The enzyme comprises a 110 kD catalytic subunit and a regulatory subunit of either 85, 55, or 50 kD. This gene encodes the 85 kD regulatory subunit. Phosphatidylinositol 3-kinase plays an important role in the metabolic actions of insulin, and a mutation in this gene has been associated with insulin resistance. Alternative splicing of this gene results in four transcript variants encoding different isoforms. | 5295 | ENSG00000145675 | PIK3R1 | phosphoinositide-3-kinase regulatory subunit 1 | NA |
| NA | 8503 | ENSG00000117461 | PIK3R3 | phosphoinositide-3-kinase regulatory subunit 3 | NA |
| This gene encodes a member of the intermediate filament family. Intermediate filaments, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene cause a dominant, pulverulent cataract. | 7431 | ENSG00000026025 | VIM | vimentin | NA |
| NA | ENSG00000260121 | ENSG00000260121 | RP5-1142A6.9 | NA | NA |
| NA | 55228 | ENSG00000182013 | PNMAL1 | paraneoplastic Ma antigen family-like 1 | NA |
| NA | 3726 | ENSG00000171223 | JUNB | JunB proto-oncogene, AP-1 transcription factor subunit | NA |
| This gene encodes a weak acid-active hyaluronidase. The encoded protein is similar in structure to other more active hyaluronidases. Hyaluronidases degrade hyaluronan, one of the major glycosaminoglycans of the extracellular matrix. Hyaluronan and fragments of hyaluronan are thought to be involved in cell proliferation, migration and differentiation. Although it was previously thought to be a lysosomal hyaluronidase that is active at a pH below 4, the encoded protein is likely a GPI-anchored cell surface protein. This hyaluronidase serves as a receptor for the oncogenic virus Jaagsiekte sheep retrovirus. The gene is one of several related genes in a region of chromosome 3p21.3 associated with tumor suppression. This gene encodes two alternatively spliced transcript variants which differ only in the 5’ UTR. | 8692 | ENSG00000068001 | HYAL2 | hyaluronoglucosaminidase 2 | NA |
| The protein encoded by this gene is structurally similar to G protein-coupled receptors and is highly expressed in endothelial cells. It binds the ligand sphingosine-1-phosphate with high affinity and high specificity, and is suggested to be involved in the processes that regulate the differentiation of endothelial cells. Activation of this receptor induces cell-cell adhesion. Alternative splicing results in multiple transcript variants. | 1901 | ENSG00000170989 | S1PR1 | sphingosine-1-phosphate receptor 1 | NA |
| The protein encoded by this gene is a membrane-bound arginine/lysine carboxypeptidase. Its expression is associated with monocyte to macrophage differentiation. This encoded protein contains hydrophobic regions at the amino and carboxy termini and has 6 potential asparagine-linked glycosylation sites. The active site residues of carboxypeptidases A and B are conserved in this protein. Three alternatively spliced transcript variants encoding the same protein have been described for this gene. | 1368 | ENSG00000135678 | CPM | carboxypeptidase M | NA |
| Members of the CELF/BRUNOL protein family contain two N-terminal RNA recognition motif (RRM) domains, one C-terminal RRM domain, and a divergent segment of 160-230 aa between the second and third RRM domains. Members of this protein family regulate pre-mRNA alternative splicing and may also be involved in mRNA editing, and translation. Alternative splicing results in multiple transcript variants encoding different isoforms. | 10659 | ENSG00000048740 | CELF2 | CUGBP, Elav-like family member 2 | NA |
| The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. Unlike the other members of the family, this encoded protein does not seem to have PPIase/rotamase activity. It may have a role in neurons associated with memory function. | 23770 | ENSG00000105701 | FKBP8 | FK506 binding protein 8 | NA |
| NA | NA | ENSG00000256545 | NA | NA | TRUE |
| Epidermodysplasia verruciformis (EV) is an autosomal recessive dermatosis characterized by abnormal susceptibility to human papillomaviruses (HPVs) and a high rate of progression to squamous cell carcinoma on sun-exposed skin. EV is caused by mutations in either of two adjacent genes located on chromosome 17q25.3. Both of these genes encode integral membrane proteins that localize to the endoplasmic reticulum and are predicted to form transmembrane channels. This gene encodes a transmembrane channel-like protein with 10 transmembrane domains and 2 leucine zipper motifs. | 11322 | ENSG00000141524 | TMC6 | transmembrane channel like 6 | NA |
| This gene encodes a serine hydrolase of the AB hydrolase superfamily that catalyzes the conversion of monoacylglycerides to free fatty acids and glycerol. The encoded protein plays a critical role in several physiological processes including pain and nociception through hydrolysis of the endocannabinoid 2-arachidonoylglycerol. Expression of this gene may play a role in cancer tumorigenesis and metastasis. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 11343 | ENSG00000074416 | MGLL | monoglyceride lipase | NA |
| This gene encodes a member of the tyrosine protein kinase family. The encoded protein plays a critical role in angiogenesis and blood vessel stability by inhibiting angiopoietin 1 signaling through the endothelial receptor tyrosine kinase Tie2. Ectodomain cleavage of the encoded protein relieves inhibition of Tie2 and is mediated by multiple factors including vascular endothelial growth factor. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 7075 | ENSG00000066056 | TIE1 | tyrosine kinase with immunoglobulin like and EGF like domains 1 | NA |
| This gene encodes a member of the S1, or chymotrypsin, family of serine peptidases. This protease catalyzes the cleavage of factor B, the rate-limiting step of the alternative pathway of complement activation. This protein also functions as an adipokine, a cell signaling protein secreted by adipocytes, which regulates insulin secretion in mice. Mutations in this gene underlie complement factor D deficiency, which is associated with recurrent bacterial meningitis infections in human patients. Alternative splicing of this gene results in multiple transcript variants. At least one of these variants encodes a preproprotein that is proteolytically processed to generate the mature protease. | 1675 | ENSG00000197766 | CFD | complement factor D | NA |
| This gene encodes ubiquitin, one of the most conserved proteins known. Ubiquitin has a major role in targeting cellular proteins for degradation by the 26S proteasome. It is also involved in the maintenance of chromatin structure, the regulation of gene expression, and the stress response. Ubiquitin is synthesized as a precursor protein consisting of either polyubiquitin chains or a single ubiquitin moiety fused to an unrelated protein. This gene consists of three direct repeats of the ubiquitin coding sequence with no spacer sequence. Consequently, the protein is expressed as a polyubiquitin precursor with a final amino acid after the last repeat. An aberrant form of this protein has been detected in patients with Alzheimer’s disease and Down syndrome. Pseudogenes of this gene are located on chromosomes 1, 2, 13, and 17. Alternative splicing results in multiple transcript variants. | 7314 | ENSG00000170315 | UBB | ubiquitin B | NA |
| This gene encodes a regulatory subunit of protein phosphatase-1 (PP1). PP1 catalyzes reversible protein phosphorylation, which is important in a wide range of cellular activities: neuronal, muscular, RNA splicing, protein synthesis, cell death, and glycogen metabolism, to name just a few. By interacting with different regulatory subunits, PP1 is directed to different parts of the cell, to different substrates, or to respond to extracellular signals. | 5507 | ENSG00000119938 | PPP1R3C | protein phosphatase 1 regulatory subunit 3C | NA |
| The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | 23336 | ENSG00000182253 | SYNM | synemin | NA |
| The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-2 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-2 expression has been detected in several different tumor types. | 6383 | ENSG00000169439 | SDC2 | syndecan 2 | NA |
| The four human glycoprotein hormones chorionic gonadotropin (CG), luteinizing hormone (LH), follicle stimulating hormone (FSH), and thyroid stimulating hormone (TSH) are dimers consisting of alpha and beta subunits that are associated noncovalently. The alpha subunits of these hormones are identical; however, their beta chains are unique and confer biological specificity. The protein encoded by this gene is the alpha subunit and belongs to the glycoprotein hormones alpha chain family. Two transcript variants encoding different isoforms have been found for this gene. | 1081 | ENSG00000135346 | CGA | glycoprotein hormones, alpha polypeptide | NA |
| This gene encodes a member of the NOTCH family of proteins. Members of this Type I transmembrane protein family share structural characteristics including an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. Notch signaling is an evolutionarily conserved intercellular signaling pathway that regulates interactions between physically adjacent cells through binding of Notch family receptors to their cognate ligands. The encoded preproprotein is proteolytically processed in the trans-Golgi network to generate two polypeptide chains that heterodimerize to form the mature cell-surface receptor. This receptor plays a role in the development of numerous cell and tissue types. Mutations in this gene are associated with aortic valve disease, Adams-Oliver syndrome, T-cell acute lymphoblastic leukemia, chronic lymphocytic leukemia, and head and neck squamous cell carcinoma. | 4851 | ENSG00000148400 | NOTCH1 | notch 1 | NA |
| NA | 222166 | ENSG00000180354 | MTURN | maturin, neural progenitor differentiation regulator homolog (Xenopus) | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",18,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
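The same export step is repeated for every cluster, so it could be wrapped in a small helper. The sketch below is only an illustration under stated assumptions: `write_cluster_genes` and its `drop_notfound` argument are hypothetical names that do not appear in the original analysis, and dropping the IDs that mygene flags as `notfound` is an optional variation, not what the call above does (it writes every queried ID). The toy data frame only mimics the columns shown in the tables; it is not real query output.

```r
# Hypothetical helper (not part of the original analysis): write the queried
# Ensembl IDs of one cluster to a text file, optionally dropping the IDs that
# mygene could not match (rows whose `notfound` column is TRUE).
write_cluster_genes <- function(out, cluster_id, out_dir, drop_notfound = FALSE) {
  res <- as.data.frame(out)
  if (drop_notfound && "notfound" %in% names(res)) {
    # `notfound` is TRUE for unmatched queries and NA for matched ones
    res <- res[is.na(res$notfound) | !res$notfound, , drop = FALSE]
  }
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster_id, ".txt"))
  write.table(res$query, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Toy input with the same column names as the annotation tables.
toy <- data.frame(
  query = c("ENSG00000173372", "ENSG00000259716", "ENSG00000133026"),
  symbol = c("C1QA", NA, "MYH10"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
path <- write_cluster_genes(toy, 18, tempdir(), drop_notfound = TRUE)
written <- readLines(path)
```

With `drop_notfound = FALSE` (the default) the helper writes every queried ID, which matches the behaviour of the `write.table` call above.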
out <- mygene::queryMany(gene_list[19,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | name | query | summary | notfound |
|---|---|---|---|---|---|
| NEB | 4703 | nebulin | ENSG00000183091 | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | NA |
| MYBPC1 | 4604 | myosin binding protein C, slow type | ENSG00000196091 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| TPM1 | 7168 | tropomyosin 1 (alpha) | ENSG00000140416 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | NA |
| ACTC1 | 70 | actin, alpha, cardiac muscle 1 | ENSG00000159251 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | NA |
| TNNT2 | 7139 | troponin T2, cardiac type | ENSG00000118194 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | NA |
| MYH1 | 4619 | myosin, heavy chain 1, skeletal muscle, adult | ENSG00000109061 | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | NA |
| MYH6 | 4624 | myosin, heavy chain 6, cardiac muscle, alpha | ENSG00000197616 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | NA |
| NPPA | 4878 | natriuretic peptide A | ENSG00000175206 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | NA |
| MYBPC3 | 4607 | myosin binding protein C, cardiac | ENSG00000134571 | MYBPC3 encodes the cardiac isoform of myosin-binding protein C. Myosin-binding protein C is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. MYBPC3, the cardiac isoform, is expressed exclusively in heart muscle. Regulatory phosphorylation of the cardiac isoform in vivo by cAMP-dependent protein kinase (PKA) upon adrenergic stimulation may be linked to modulation of cardiac contraction. Mutations in MYBPC3 are one cause of familial hypertrophic cardiomyopathy. | NA |
| TPT1 | 7178 | tumor protein, translationally-controlled 1 | ENSG00000133112 | NA | NA |
| PDK4 | 5166 | pyruvate dehydrogenase kinase 4 | ENSG00000004799 | This gene is a member of the PDK/BCKDK protein kinase family and encodes a mitochondrial protein with a histidine kinase domain. This protein is located in the matrix of the mitochondria and inhibits the pyruvate dehydrogenase complex by phosphorylating one of its subunits, thereby contributing to the regulation of glucose metabolism. Expression of this gene is regulated by glucocorticoids, retinoic acid and insulin. | NA |
| RYR1 | 6261 | ryanodine receptor 1 | ENSG00000196218 | This gene encodes a ryanodine receptor found in skeletal muscle. The encoded protein functions as a calcium release channel in the sarcoplasmic reticulum but also serves to connect the sarcoplasmic reticulum and transverse tubule. Mutations in this gene are associated with malignant hyperthermia susceptibility, central core disease, and minicore myopathy with external ophthalmoplegia. Alternatively spliced transcripts encoding different isoforms have been described. | NA |
| TNNC2 | 7125 | troponin C2, fast skeletal type | ENSG00000101470 | Troponin (Tn), a key protein complex in the regulation of striated muscle contraction, is composed of 3 subunits. The Tn-I subunit inhibits actomyosin ATPase, the Tn-T subunit binds tropomyosin and Tn-C, while the Tn-C subunit binds calcium and overcomes the inhibitory action of the troponin complex on actin filaments. The protein encoded by this gene is the Tn-C subunit. | NA |
| MYL7 | 58498 | myosin light chain 7 | ENSG00000106631 | NA | NA |
| ATP2A1 | 487 | ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 1 | ENSG00000196296 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol to the sarcoplasmic reticulum lumen, and is involved in muscular excitation and contraction. Mutations in this gene cause some autosomal recessive forms of Brody disease, characterized by increasing impairment of muscular relaxation during exercise. Alternative splicing results in three transcript variants encoding different isoforms. | NA |
| MYH2 | 4620 | myosin, heavy chain 2, skeletal muscle, adult | ENSG00000125414 | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | NA |
| DKK3 | 27122 | dickkopf WNT signaling pathway inhibitor 3 | ENSG00000050165 | This gene encodes a protein that is a member of the dickkopf family. The secreted protein contains two cysteine rich regions and is involved in embryonic development through its interactions with the Wnt signaling pathway. The expression of this gene is decreased in a variety of cancer cell lines and it may function as a tumor suppressor gene. Alternative splicing results in multiple transcript variants encoding the same protein. | NA |
| CPA1 | 1357 | carboxypeptidase A1 | ENSG00000091704 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | NA |
| MYH11 | 4629 | myosin, heavy chain 11, smooth muscle | ENSG00000133392 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| NPPA-AS1 | ENSG00000242349 | NPPA antisense RNA 1 | ENSG00000242349 | NA | NA |
| MYBPC2 | 4606 | myosin binding protein C, fast type | ENSG00000086967 | This gene encodes a member of the myosin-binding protein C family. This family includes the fast-, slow- and cardiac-type isoforms, each of which is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The protein encoded by this locus is referred to as the fast-type isoform. Mutations in the related but distinct genes encoding the slow-type and cardiac-type isoforms have been associated with distal arthrogryposis, type 1 and hypertrophic cardiomyopathy, respectively. | NA |
| CPB1 | 1360 | carboxypeptidase B1 | ENSG00000153002 | Three different procarboxypeptidases A and two different procarboxypeptidases B have been isolated. The B1 and B2 forms differ from each other mainly in isoelectric point. Carboxypeptidase B1 is a highly tissue-specific protein and is a useful serum marker for acute pancreatitis and dysfunction of pancreatic transplants. It is not elevated in pancreatic carcinoma. | NA |
| ANKRD1 | 27063 | ankyrin repeat domain 1 | ENSG00000148677 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | NA |
| PRSS1 | 5644 | protease, serine 1 | ENSG00000204983 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | NA |
| BIN1 | 274 | bridging integrator 1 | ENSG00000136717 | This gene encodes several isoforms of a nucleocytoplasmic adaptor protein, one of which was initially identified as a MYC-interacting protein with features of a tumor suppressor. Isoforms that are expressed in the central nervous system may be involved in synaptic vesicle endocytosis and may interact with dynamin, synaptojanin, endophilin, and clathrin. Isoforms that are expressed in muscle and ubiquitously expressed isoforms localize to the cytoplasm and nucleus and activate a caspase-independent apoptotic process. Studies in mouse suggest that this gene plays an important role in cardiac muscle development. Alternate splicing of the gene results in several transcript variants encoding different isoforms. Aberrant splice variants expressed in tumor cell lines have also been described. | NA |
| PYGB | 5834 | phosphorylase, glycogen; brain | ENSG00000100994 | The protein encoded by this gene is a glycogen phosphorylase found predominantly in the brain. The encoded protein forms homodimers which can associate into homotetramers, the enzymatically active form of glycogen phosphorylase. The activity of this enzyme is positively regulated by AMP and negatively regulated by ATP, ADP, and glucose-6-phosphate. This enzyme catalyzes the rate-determining step in glycogen degradation. | NA |
| MYL1 | 4632 | myosin light chain 1 | ENSG00000168530 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in fast skeletal muscle. Two transcript variants have been identified for this gene. | NA |
| FN1 | 2335 | fibronectin 1 | ENSG00000115414 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | NA |
| TNNI3 | 7137 | troponin I3, cardiac type | ENSG00000129991 | Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit, blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. This gene encodes the TnI-cardiac protein and is exclusively expressed in cardiac muscle tissues. Mutations in this gene cause familial hypertrophic cardiomyopathy type 7 (CMH7) and familial restrictive cardiomyopathy (RCM). | NA |
| RP11-290D2.6 | ENSG00000273149 | NA | ENSG00000273149 | NA | NA |
| CRIP2 | 1397 | cysteine rich protein 2 | ENSG00000182809 | This gene encodes a putative transcription factor with two LIM zinc-binding domains. The encoded protein may participate in the differentiation of smooth muscle tissue. Alternative splicing results in multiple transcript variants. | NA |
| NEAT1 | 283131 | nuclear paraspeckle assembly transcript 1 (non-protein coding) | ENSG00000245532 | This gene produces a long non-coding RNA (lncRNA) transcribed from the multiple endocrine neoplasia locus. This lncRNA is retained in the nucleus where it forms the core structural component of the paraspeckle sub-organelles. It may act as a transcriptional regulator for numerous genes, including some genes involved in cancer progression. | NA |
| PNLIP | 5406 | pancreatic lipase | ENSG00000175535 | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | NA |
| GP2 | 2813 | glycoprotein 2 | ENSG00000169347 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | NA |
| YBX3 | 8531 | Y-box binding protein 3 | ENSG00000060138 | NA | NA |
| KLHL41 | 10324 | kelch like family member 41 | ENSG00000239474 | This gene is a member of the kelch-like family. The encoded protein contains a BACK domain, a BTB/POZ domain, and 5 Kelch repeats. This protein is thought to function in skeletal muscle development and maintenance. Mutations in this gene have been associated with nemaline myopathy (NM), a rare congenital muscle disorder. | NA |
| ZFAND5 | 7763 | zinc finger AN1-type containing 5 | ENSG00000107372 | NA | NA |
| HBB | 3043 | hemoglobin subunit beta | ENSG00000244734 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | NA |
| KRT10 | 3858 | keratin 10 | ENSG00000186395 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | NA |
| MYLPF | 29895 | myosin light chain, phosphorylatable, fast skeletal muscle | ENSG00000180209 | NA | NA |
| TNNI2 | 7136 | troponin I2, fast skeletal type | ENSG00000130598 | This gene encodes a fast-twitch skeletal muscle protein, a member of the troponin I gene family, and a component of the troponin complex including troponin T, troponin C and troponin I subunits. The troponin complex, along with tropomyosin, is responsible for the calcium-dependent regulation of striated muscle contraction. Mouse studies show that this component is also present in vascular smooth muscle and may play a role in regulation of smooth muscle function. In addition to muscle tissues, this protein is found in corneal epithelium, cartilage where it is an inhibitor of angiogenesis to inhibit tumor growth and metastasis, and mammary gland where it functions as a co-activator of estrogen receptor-related receptor alpha. This protein also suppresses tumor growth in human ovarian carcinoma. Mutations in this gene cause myopathy and distal arthrogryposis type 2B. Alternatively spliced transcript variants have been found for this gene. | NA |
| CELA3A | 10136 | chymotrypsin like elastase family member 3A | ENSG00000142789 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | NA |
| PFKFB3 | 5209 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3 | ENSG00000170525 | The protein encoded by this gene belongs to a family of bifunctional proteins that are involved in both the synthesis and degradation of fructose-2,6-bisphosphate, a regulatory molecule that controls glycolysis in eukaryotes. The encoded protein has a 6-phosphofructo-2-kinase activity that catalyzes the synthesis of fructose-2,6-bisphosphate (F2,6BP), and a fructose-2,6-biphosphatase activity that catalyzes the degradation of F2,6BP. This protein is required for cell cycle progression and prevention of apoptosis. It functions as a regulator of cyclin-dependent kinase 1, linking glucose metabolism to cell proliferation and survival in tumor cells. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| TNNT3 | 7140 | troponin T3, fast skeletal type | ENSG00000130595 | The binding of Ca(2+) to the trimeric troponin complex initiates the process of muscle contraction. Increased Ca(2+) concentrations produce a conformational change in the troponin complex that is transmitted to tropomyosin dimers situated along actin filaments. The altered conformation permits increased interaction between a myosin head and an actin filament which, ultimately, produces a muscle contraction. The troponin complex has protein subunits C, I, and T. Subunit C binds Ca(2+) and subunit I binds to actin and inhibits actin-myosin interaction. Subunit T binds the troponin complex to the tropomyosin complex and is also required for Ca(2+)-mediated activation of actomyosin ATPase activity. There are 3 different troponin T genes that encode tissue-specific isoforms of subunit T for fast skeletal-, slow skeletal-, and cardiac-muscle. This gene encodes fast skeletal troponin T protein; also known as troponin T type 3. Alternative splicing results in multiple transcript variants encoding additional distinct troponin T type 3 isoforms. A developmentally regulated switch between fetal/neonatal and adult troponin T type 3 isoforms occurs. Additional splice variants have been described but their biological validity has not been established. Mutations in this gene may cause distal arthrogryposis multiplex congenita type 2B (DA2B). | NA |
| HSPB7 | 27129 | heat shock protein family B (small) member 7 | ENSG00000173641 | NA | NA |
| STAC3 | 246329 | SH3 and cysteine rich domain 3 | ENSG00000185482 | The protein encoded by this gene is a component of the excitation-contraction coupling machinery of muscles. This protein is a member of the Stac gene family and contains an N-terminal cysteine-rich domain and two SH3 domains. Mutations in this gene are a cause of Native American myopathy. | NA |
| MYL4 | 4635 | myosin light chain 4 | ENSG00000198336 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two myosin heavy chains, two nonphosphorylatable myosin alkali light chains, and two phosphorylatable myosin regulatory light chains. This gene encodes a myosin alkali light chain that is found in embryonic muscle and adult atria. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | NA |
| MYL9 | 10398 | myosin light chain 9 | ENSG00000101335 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| TNNT1 | 7138 | troponin T1, slow skeletal type | ENSG00000105048 | This gene encodes a protein that is a subunit of troponin, which is a regulatory complex located on the thin filament of the sarcomere. This complex regulates striated muscle contraction in response to fluctuations in intracellular calcium concentration. This complex is composed of three subunits: troponin C, which binds calcium, troponin T, which binds tropomyosin, and troponin I, which is an inhibitory subunit. This protein is the slow skeletal troponin T subunit. Mutations in this gene cause nemaline myopathy type 5, also known as Amish nemaline myopathy, a neuromuscular disorder characterized by muscle weakness and rod-shaped, or nemaline, inclusions in skeletal muscle fibers which affects infants, resulting in death due to respiratory insufficiency, usually in the second year. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| CASQ2 | 845 | calsequestrin 2 | ENSG00000118729 | The protein encoded by this gene specifies the cardiac muscle family member of the calsequestrin family. Calsequestrin is localized to the sarcoplasmic reticulum in cardiac and slow skeletal muscle cells. The protein is a calcium binding protein that stores calcium for muscle function. Mutations in this gene cause stress-induced polymorphic ventricular tachycardia, also referred to as catecholaminergic polymorphic ventricular tachycardia 2 (CPVT2), a disease characterized by bidirectional ventricular tachycardia that may lead to cardiac arrest. | NA |
| SLC7A2 | 6542 | solute carrier family 7 member 2 | ENSG00000003989 | The protein encoded by this gene is a cationic amino acid transporter and a member of the APC (amino acid-polyamine-organocation) family of transporters. The encoded membrane protein is responsible for the cellular uptake of arginine, lysine and ornithine. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| NA | NA | NA | ENSG00000259716 | NA | TRUE |
| RBPMS2 | 348093 | RNA binding protein with multiple splicing 2 | ENSG00000166831 | NA | NA |
| CEL | 1056 | carboxyl ester lipase | ENSG00000170835 | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | NA |
| CLPS | 1208 | colipase | ENSG00000137392 | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| CTRB2 | 440387 | chymotrypsinogen B2 | ENSG00000168928 | NA | NA |
| KRT1 | 3848 | keratin 1 | ENSG00000167768 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| MICAL2 | 9645 | microtubule associated monooxygenase, calponin and LIM domain containing 2 | ENSG00000133816 | NA | NA |
| PYGM | 5837 | phosphorylase, glycogen, muscle | ENSG00000068976 | This gene encodes a muscle enzyme involved in glycogenolysis. Highly similar enzymes encoded by different genes are found in liver and brain. Mutations in this gene are associated with McArdle disease (myophosphorylase deficiency), a glycogen storage disease of muscle. Alternative splicing results in multiple transcript variants. | NA |
| NPPB | 4879 | natriuretic peptide B | ENSG00000120937 | This gene is a member of the natriuretic peptide family and encodes a secreted protein which functions as a cardiac hormone. The protein undergoes two cleavage events, one within the cell and a second after secretion into the blood. The protein’s biological actions include natriuresis, diuresis, vasorelaxation, inhibition of renin and aldosterone secretion, and a key role in cardiovascular homeostasis. A high concentration of this protein in the bloodstream is indicative of heart failure. The protein also acts as an antimicrobial peptide with antibacterial and antifungal activity. Mutations in this gene have been associated with postmenopausal osteoporosis. | NA |
| KIAA0368 | 23392 | KIAA0368 | ENSG00000136813 | NA | NA |
| CA3 | 761 | carbonic anhydrase 3 | ENSG00000164879 | Carbonic anhydrase III (CAIII) is a member of a multigene family (at least six separate genes are known) that encodes carbonic anhydrase isozymes. These carbonic anhydrases are a class of metalloenzymes that catalyze the reversible hydration of carbon dioxide and are differentially expressed in a number of cell types. The expression of the CA3 gene is strictly tissue specific and present at high levels in skeletal muscle and much lower levels in cardiac and smooth muscle. A proportion of carriers of Duchenne muscular dystrophy have a higher CA3 level than normal. The gene spans 10.3 kb and contains seven exons and six introns. | NA |
| CLIC4 | 25932 | chloride intracellular channel 4 | ENSG00000169504 | Chloride channels are a diverse group of proteins that regulate fundamental cellular processes including stabilization of cell membrane potential, transepithelial transport, maintenance of intracellular pH, and regulation of cell volume. Chloride intracellular channel 4 (CLIC4) protein, encoded by the CLIC4 gene, is a member of the p64 family; the gene is expressed in many tissues and exhibits an intracellular vesicular pattern in Panc-1 cells (pancreatic cancer cells). | NA |
| C3 | 718 | complement component 3 | ENSG00000125730 | Complement component C3 plays a central role in the activation of complement system. Its activation is required for both classical and alternative complement activation pathways. The encoded preproprotein is proteolytically processed to generate alpha and beta subunits that form the mature protein, which is then further processed to generate numerous peptide products. The C3a peptide, also known as the C3a anaphylatoxin, modulates inflammation and possesses antimicrobial activity. Mutations in this gene are associated with atypical hemolytic uremic syndrome and age-related macular degeneration in human patients. | NA |
| EIF4B | 1975 | eukaryotic translation initiation factor 4B | ENSG00000063046 | NA | NA |
| CELA3B | 23436 | chymotrypsin like elastase family member 3B | ENSG00000219073 | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | NA |
| KRT2 | 3849 | keratin 2 | ENSG00000172867 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| MKNK2 | 2872 | MAP kinase interacting serine/threonine kinase 2 | ENSG00000099875 | This gene encodes a member of the calcium/calmodulin-dependent protein kinases (CAMK) Ser/Thr protein kinase family, which belongs to the protein kinase superfamily. This protein contains conserved DLG (asp-leu-gly) and ENIL (glu-asn-ile-leu) motifs, and an N-terminal polybasic region which binds importin A and the translation factor scaffold protein eukaryotic initiation factor 4G (eIF4G). This protein is one of the downstream kinases activated by mitogen-activated protein (MAP) kinases. It phosphorylates the eukaryotic initiation factor 4E (eIF4E), thus playing important roles in the initiation of mRNA translation, oncogenic transformation and malignant cell proliferation. In addition to eIF4E, this protein also interacts with von Hippel-Lindau tumor suppressor (VHL), ring-box 1 (Rbx1) and Cullin2 (Cul2), which are all components of the CBC(VHL) ubiquitin ligase E3 complex. Multiple alternatively spliced transcript variants have been found, but the full-length nature and biological activity of only two variants are determined. These two variants encode distinct isoforms which differ in activity and regulation, and in subcellular localization. | NA |
| NEBL | 10529 | nebulette | ENSG00000078114 | This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. | NA |
| PABPC4 | 8761 | poly(A) binding protein cytoplasmic 4 | ENSG00000090621 | Poly(A)-binding proteins (PABPs) bind to the poly(A) tail present at the 3-prime ends of most eukaryotic mRNAs. PABPC4 or IPABP (inducible PABP) was isolated as an activation-induced T-cell mRNA encoding a protein. Activation of T cells increased PABPC4 mRNA levels in T cells approximately 5-fold. PABPC4 contains 4 RNA-binding domains and proline-rich C terminus. PABPC4 is localized primarily to the cytoplasm. It is suggested that PABPC4 might be necessary for regulation of stability of labile mRNA species in activated T cells. PABPC4 was also identified as an antigen, APP1 (activated-platelet protein-1), expressed on thrombin-activated rabbit platelets. PABPC4 may also be involved in the regulation of protein translation in platelets and megakaryocytes or may participate in the binding or stabilization of polyadenylates in platelet dense granules. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| CTRB1 | 1504 | chymotrypsinogen B1 | ENSG00000168925 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | NA |
| POPDC2 | 64091 | popeye domain containing 2 | ENSG00000121577 | This gene encodes a member of the POP family of proteins which contain three putative transmembrane domains. This membrane associated protein is predominantly expressed in skeletal and cardiac muscle, and may have an important function in these tissues. | NA |
| FOXO1 | 2308 | forkhead box O1 | ENSG00000150907 | This gene belongs to the forkhead family of transcription factors which are characterized by a distinct forkhead domain. The specific function of this gene has not yet been determined; however, it may play a role in myogenic growth and differentiation. Translocation of this gene with PAX3 has been associated with alveolar rhabdomyosarcoma. | NA |
| ACTB | 60 | actin, beta | ENSG00000075624 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | NA |
| MYADM | 91663 | myeloid-associated differentiation marker | ENSG00000179820 | NA | NA |
| SLC25A4 | 291 | solute carrier family 25 member 4 | ENSG00000151729 | This gene is a member of the mitochondrial carrier subfamily of solute carrier protein genes. The product of this gene functions as a gated pore that translocates ADP from the cytoplasm into the mitochondrial matrix and ATP from the mitochondrial matrix into the cytoplasm. The protein forms a homodimer embedded in the inner mitochondrial membrane. Mutations in this gene have been shown to result in autosomal dominant progressive external ophthalmoplegia and familial hypertrophic cardiomyopathy. | NA |
| VIM | 7431 | vimentin | ENSG00000026025 | This gene encodes a member of the intermediate filament family. Intermediate filamentents, along with microtubules and actin microfilaments, make up the cytoskeleton. The protein encoded by this gene is responsible for maintaining cell shape, integrity of the cytoplasm, and stabilizing cytoskeletal interactions. It is also involved in the immune response, and controls the transport of low-density lipoprotein (LDL)-derived cholesterol from a lysosome to the site of esterification. It functions as an organizer of a number of critical proteins involved in attachment, migration, and cell signaling. Mutations in this gene causes a dominant, pulverulent cataract. | NA |
| TNC | 3371 | tenascin C | ENSG00000041982 | This gene encodes an extracellular matrix protein with a spatially and temporally restricted tissue distribution. This protein is homohexameric with disulfide-linked subunits, and contains multiple EGF-like and fibronectin type-III domains. It is implicated in guidance of migrating neurons as well as axons during development, synaptic plasticity, and neuronal regeneration. | NA |
| CPA2 | 1358 | carboxypeptidase A2 | ENSG00000158516 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | NA |
| LAMA5 | 3911 | laminin subunit alpha 5 | ENSG00000130702 | This gene encodes one of the vertebrate laminin alpha chains. Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins are composed of 3 non-identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively) and they form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. The protein encoded by this gene is the alpha-5 subunit of laminin-10 (laminin-511), laminin-11 (laminin-521) and laminin-15 (laminin-523). | NA |
| MXRA7 | 439921 | matrix remodeling associated 7 | ENSG00000182534 | NA | NA |
| JPH2 | 57158 | junctophilin 2 | ENSG00000149596 | Junctional complexes between the plasma membrane and endoplasmic/sarcoplasmic reticulum are a common feature of all excitable cell types and mediate cross talk between cell surface and intracellular ion channels. The protein encoded by this gene is a component of junctional complexes and is composed of a C-terminal hydrophobic segment spanning the endoplasmic/sarcoplasmic reticulum membrane and a remaining cytoplasmic domain that shows specific affinity for the plasma membrane. This gene is a member of the junctophilin gene family. Alternative splicing has been observed at this locus and two variants encoding distinct isoforms are described. | NA |
| PLA2G1B | 5319 | phospholipase A2 group IB | ENSG00000170890 | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | NA |
| DEPTOR | 64798 | DEP domain containing MTOR-interacting protein | ENSG00000155792 | NA | NA |
| AMY2A | 279 | amylase, alpha 2A (pancreatic) | ENSG00000243480 | This gene encodes a member of the alpha-amylase family of proteins. Amylases are secreted proteins that hydrolyze 1,4-alpha-glucoside bonds in oligosaccharides and polysaccharides, catalyzing the first step in digestion of dietary starch and glycogen. This gene and several family members are present in a gene cluster on chromosome 1. This gene encodes an amylase isoenzyme produced by the pancreas. | NA |
| PPP1R27 | 116729 | protein phosphatase 1 regulatory subunit 27 | ENSG00000182676 | NA | NA |
| KIAA1217 | 56243 | KIAA1217 | ENSG00000120549 | NA | NA |
| ABLIM2 | 84448 | actin binding LIM protein family member 2 | ENSG00000163995 | NA | NA |
| LPIN1 | 23175 | lipin 1 | ENSG00000134324 | This gene encodes a magnesium-ion-dependent phosphatidic acid phosphohydrolase enzyme that catalyzes the penultimate step in triglyceride synthesis including the dephosphorylation of phosphatidic acid to yield diacylglycerol. Expression of this gene is required for adipocyte differentiation and it also functions as a nuclear transcriptional coactivator with some peroxisome proliferator-activated receptors to modulate expression of other genes involved in lipid metabolism. Mutations in this gene are associated with metabolic syndrome, type 2 diabetes, and autosomal recessive acute recurrent myoglobinuria (ARARM). This gene is also a candidate for several human lipodystrophy syndromes. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Additional splice variants have been described but their full-length structures have not been determined. | NA |
| HECTD1 | 25831 | HECT domain E3 ubiquitin protein ligase 1 | ENSG00000092148 | NA | NA |
| ADCK3 | 56997 | aarF domain containing kinase 3 | ENSG00000163050 | This gene encodes a mitochondrial protein similar to yeast ABC1, which functions in an electron-transferring membrane protein complex in the respiratory chain. It is not related to the family of ABC transporter proteins. Expression of this gene is induced by the tumor suppressor p53 and in response to DNA damage, and inhibiting its expression partially suppresses p53-induced apoptosis. Alternatively spliced transcript variants have been found; however, their full-length nature has not been determined. | NA |
| FABP4 | 2167 | fatty acid binding protein 4 | ENSG00000170323 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs' roles include fatty acid uptake, transport, and metabolism. | NA |
| LOC100507537 | 100507537 | uncharacterized LOC100507537 | ENSG00000240045 | NA | NA |
| FAM46B | 115572 | family with sequence similarity 46 member B | ENSG00000158246 | NA | NA |
| PPP1R12C | 54776 | protein phosphatase 1 regulatory subunit 12C | ENSG00000125503 | The gene encodes a subunit of myosin phosphatase. The encoded protein regulates the catalytic activity of protein phosphatase 1 delta and assembly of the actin cytoskeleton. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| PRKAG2 | 51422 | protein kinase AMP-activated non-catalytic subunit gamma 2 | ENSG00000106617 | AMP-activated protein kinase (AMPK) is a heterotrimeric protein composed of a catalytic alpha subunit, a noncatalytic beta subunit, and a noncatalytic regulatory gamma subunit. Various forms of each of these subunits exist, encoded by different genes. AMPK is an important energy-sensing enzyme that monitors cellular energy status and functions by inactivating key enzymes involved in regulating de novo biosynthesis of fatty acid and cholesterol. This gene is a member of the AMPK gamma subunit family. Mutations in this gene have been associated with Wolff-Parkinson-White syndrome, familial hypertrophic cardiomyopathy, and glycogen storage disease of the heart. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | NA |
| COL6A2 | 1292 | collagen type VI alpha 2 | ENSG00000142173 | This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | NA |
| UCP3 | 7352 | uncoupling protein 3 | ENSG00000175564 | Mitochondrial uncoupling proteins (UCP) are members of the larger family of mitochondrial anion carrier proteins (MACP). UCPs separate oxidative phosphorylation from ATP synthesis with energy dissipated as heat, also referred to as the mitochondrial proton leak. UCPs facilitate the transfer of anions from the inner to the outer mitochondrial membrane and the return transfer of protons from the outer to the inner mitochondrial membrane. They also reduce the mitochondrial membrane potential in mammalian cells. The different UCPs have tissue-specific expression; this gene is primarily expressed in skeletal muscle. This gene’s protein product is postulated to protect mitochondria against lipid-induced oxidative stress. Expression levels of this gene increase when fatty acid supplies to mitochondria exceed their oxidation capacity and the protein enables the export of fatty acids from mitochondria. UCPs contain the three solcar protein domains typically found in MACPs. Two splice variants have been found for this gene. | NA |
| DHRS7 | 51635 | dehydrogenase/reductase 7 | ENSG00000100612 | This gene encodes a member of the short-chain dehydrogenases/reductases (SDR) family, which has over 46,000 members. Members in this family are enzymes that metabolize many different compounds, such as steroid hormones, prostaglandins, retinoids, lipids and xenobiotics. | NA |
| VASP | 7408 | vasodilator-stimulated phosphoprotein | ENSG00000125753 | Vasodilator-stimulated phosphoprotein (VASP) is a member of the Ena-VASP protein family. Ena-VASP family members contain an EVH1 N-terminal domain that binds proteins containing E/DFPPPPXD/E motifs and targets Ena-VASP proteins to focal adhesions. In the mid-region of the protein, family members have a proline-rich domain that binds SH3 and WW domain-containing proteins. Their C-terminal EVH2 domain mediates tetramerization and binds both G and F actin. VASP is associated with filamentous actin formation and likely plays a widespread role in cell adhesion and motility. VASP may also be involved in the intracellular signaling pathways that regulate integrin-extracellular matrix interactions. VASP is regulated by the cyclic nucleotide-dependent kinases PKA and PKG. | NA |
# Save the Ensembl IDs queried for cluster 19, one per line.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_", 19, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
# Annotate the top genes of cluster 20 with mygene.
out <- mygene::queryMany(gene_list[20,], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | summary | query | name | notfound |
|---|---|---|---|---|---|
| HBB | 3043 | The alpha (HBA) and beta (HBB) loci determine the structure of the 2 types of polypeptide chains in adult hemoglobin, Hb A. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. Mutant beta globin causes sickle cell anemia. Absence of beta chain causes beta-zero-thalassemia. Reduced amounts of detectable beta globin causes beta-plus-thalassemia. The order of the genes in the beta-globin cluster is 5’-epsilon – gamma-G – gamma-A – delta – beta–3’. | ENSG00000244734 | hemoglobin subunit beta | NA |
| CYP17A1 | 1586 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | ENSG00000148795 | cytochrome P450 family 17 subfamily A member 1 | NA |
| CYP11B1 | 1584 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and is involved in the conversion of progesterone to cortisol in the adrenal cortex. Mutations in this gene cause congenital adrenal hyperplasia due to 11-beta-hydroxylase deficiency. Transcript variants encoding different isoforms have been noted for this gene. | ENSG00000160882 | cytochrome P450 family 11 subfamily B member 1 | NA |
| PGC | 5225 | This gene encodes an aspartic proteinase that belongs to the peptidase family A1. The encoded protein is a digestive enzyme that is produced in the stomach and constitutes a major component of the gastric mucosa. This protein is also secreted into the serum. This protein is synthesized as an inactive zymogen that includes a highly basic prosegment. This enzyme is converted into its active mature form at low pH by sequential cleavage of the prosegment that is carried out by the enzyme itself. Polymorphisms in this gene are associated with susceptibility to gastric cancers. Serum levels of this enzyme are used as a biomarker for certain gastric diseases including Helicobacter pylori related gastritis. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 1. | ENSG00000096088 | progastricsin | NA |
| MBP | 4155 | The protein encoded by the classic MBP gene is a major constituent of the myelin sheath of oligodendrocytes and Schwann cells in the nervous system. However, MBP-related transcripts are also present in the bone marrow and the immune system. These mRNAs arise from the long MBP gene (otherwise called ‘Golli-MBP’) that contains 3 additional exons located upstream of the classic MBP exons. Alternative splicing from the Golli and the MBP transcription start sites gives rise to 2 sets of MBP-related transcripts and gene products. The Golli mRNAs contain 3 exons unique to Golli-MBP, spliced in-frame to 1 or more MBP exons. They encode hybrid proteins that have N-terminal Golli aa sequence linked to MBP aa sequence. The second family of transcripts contain only MBP exons and produce the well characterized myelin basic proteins. This complex gene structure is conserved among species suggesting that the MBP transcription unit is an integral part of the Golli transcription unit and that this arrangement is important for the function and/or regulation of these genes. | ENSG00000197971 | myelin basic protein | NA |
| DES | 1674 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | ENSG00000175084 | desmin | NA |
| HSPD1 | 3329 | This gene encodes a member of the chaperonin family. The encoded mitochondrial protein may function as a signaling molecule in the innate immune system. This protein is essential for the folding and assembly of newly imported proteins in the mitochondria. This gene is adjacent to a related family member and the region between the 2 genes functions as a bidirectional promoter. Several pseudogenes have been associated with this gene. Two transcript variants encoding the same protein have been identified for this gene. Mutations associated with this gene cause autosomal recessive spastic paraplegia 13. | ENSG00000144381 | heat shock protein family D (Hsp60) member 1 | NA |
| HBA2 | 3040 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | ENSG00000188536 | hemoglobin subunit alpha 2 | NA |
| TPM2 | 7169 | This gene encodes beta-tropomyosin, a member of the actin filament binding protein family, and mainly expressed in slow, type 1 muscle fibers. Mutations in this gene can alter the expression of other sarcomeric tropomyosin proteins, and cause cap disease, nemaline myopathy and distal arthrogryposis syndromes. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000198467 | tropomyosin 2 (beta) | NA |
| AKR1B1 | 231 | This gene encodes a member of the aldo/keto reductase superfamily, which consists of more than 40 known enzymes and proteins. This member catalyzes the reduction of a number of aldehydes, including the aldehyde form of glucose, and is thereby implicated in the development of diabetic complications by catalyzing the reduction of glucose to sorbitol. Multiple pseudogenes have been identified for this gene. The nomenclature system used by the HUGO Gene Nomenclature Committee to define human aldo-keto reductase family members is known to differ from that used by the Mouse Genome Informatics database. | ENSG00000085662 | aldo-keto reductase family 1 member B | NA |
| STAR | 6770 | The protein encoded by this gene plays a key role in the acute regulation of steroid hormone synthesis by enhancing the conversion of cholesterol into pregnenolone. This protein permits the cleavage of cholesterol into pregnenolone by mediating the transport of cholesterol from the outer mitochondrial membrane to the inner mitochondrial membrane. Mutations in this gene are a cause of congenital lipoid adrenal hyperplasia (CLAH), also called lipoid CAH. A pseudogene of this gene is located on chromosome 13. | ENSG00000147465 | steroidogenic acute regulatory protein | NA |
| LIPF | 8513 | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in humans. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000182333 | lipase F, gastric type | NA |
| TXNRD1 | 7296 | This gene encodes a member of the family of pyridine nucleotide oxidoreductases. This protein reduces thioredoxins as well as other substrates, and plays a role in selenium metabolism and protection against oxidative stress. The functional enzyme is thought to be a homodimer which uses FAD as a cofactor. Each subunit contains a selenocysteine (Sec) residue which is required for catalytic activity. The selenocysteine is encoded by the UGA codon that normally signals translation termination. The 3’ UTR of selenocysteine-containing genes have a common stem-loop structure, the sec insertion sequence (SECIS), that is necessary for the recognition of UGA as a Sec codon rather than as a stop signal. Alternative splicing results in several transcript variants encoding the same or different isoforms. | ENSG00000198431 | thioredoxin reductase 1 | NA |
| NA | NA | NA | ENSG00000090920 | NA | TRUE |
| ATP2B4 | 493 | The protein encoded by this gene belongs to the family of P-type primary ion transport ATPases characterized by the formation of an aspartyl phosphate intermediate during the reaction cycle. These enzymes remove bivalent calcium ions from eukaryotic cells against very large concentration gradients and play a critical role in intracellular calcium homeostasis. The mammalian plasma membrane calcium ATPase isoforms are encoded by at least four separate genes and the diversity of these enzymes is further increased by alternative splicing of transcripts. The expression of different isoforms and splice variants is regulated in a developmental, tissue- and cell type-specific manner, suggesting that these pumps are functionally adapted to the physiological needs of particular cells and tissues. This gene encodes the plasma membrane calcium ATPase isoform 4. Alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000058668 | ATPase plasma membrane Ca2+ transporting 4 | NA |
| CYP21A2 | 1589 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and hydroxylates steroids at the 21 position. Its activity is required for the synthesis of steroid hormones including cortisol and aldosterone. Mutations in this gene cause congenital adrenal hyperplasia. A related pseudogene is located near this gene; gene conversion events involving the functional gene and the pseudogene are thought to account for many cases of steroid 21-hydroxylase deficiency. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000231852 | cytochrome P450 family 21 subfamily A member 2 | NA |
| FOSL2 | 2355 | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. | ENSG00000075426 | FOS like 2, AP-1 transcription factor subunit | NA |
| HSPA8 | 3312 | This gene encodes a member of the heat shock protein 70 family, which contains both heat-inducible and constitutively expressed members. This protein belongs to the latter group, which are also referred to as heat-shock cognate proteins. It functions as a chaperone, and binds to nascent polypeptides to facilitate correct folding. It also functions as an ATPase in the disassembly of clathrin-coated vesicles during transport of membrane components through the cell. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000109971 | heat shock protein family A (Hsp70) member 8 | NA |
| SLC44A2 | 57153 | NA | ENSG00000129353 | solute carrier family 44 member 2 | NA |
| MYL9 | 10398 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000101335 | myosin light chain 9 | NA |
| KRT10 | 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | ENSG00000186395 | keratin 10 | NA |
| HSP90AA1 | 3320 | The protein encoded by this gene is an inducible molecular chaperone that functions as a homodimer. The encoded protein aids in the proper folding of specific target proteins by use of an ATPase activity that is modulated by co-chaperones. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000080824 | heat shock protein 90kDa alpha family class A member 1 | NA |
| STAT3 | 6774 | The protein encoded by this gene is a member of the STAT protein family. In response to cytokines and growth factors, STAT family members are phosphorylated by the receptor associated kinases, and then form homo- or heterodimers that translocate to the cell nucleus where they act as transcription activators. This protein is activated through phosphorylation in response to various cytokines and growth factors including IFNs, EGF, IL5, IL6, HGF, LIF and BMP2. This protein mediates the expression of a variety of genes in response to cell stimuli, and thus plays a key role in many cellular processes such as cell growth and apoptosis. The small GTPase Rac1 has been shown to bind and regulate the activity of this protein. PIAS3 protein is a specific inhibitor of this protein. Mutations in this gene are associated with infantile-onset multisystem autoimmune disease and hyper-immunoglobulin E syndrome. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | ENSG00000168610 | signal transducer and activator of transcription 3 | NA |
| CYP11A1 | 1583 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and catalyzes the conversion of cholesterol to pregnenolone, the first and rate-limiting step in the synthesis of the steroid hormones. Two transcript variants encoding different isoforms have been found for this gene. The cellular location of the smaller isoform is unclear since it lacks the mitochondrial-targeting transit peptide. | ENSG00000140459 | cytochrome P450 family 11 subfamily A member 1 | NA |
| RP11-862L9.3 | ENSG00000266844 | NA | ENSG00000266844 | NA | NA |
| ALAS1 | 211 | This gene encodes the mitochondrial enzyme which catalyzes the rate-limiting step in heme (iron-protoporphyrin) biosynthesis. The enzyme encoded by this gene is the housekeeping enzyme; a separate gene encodes a form of the enzyme that is specific for erythroid tissue. The level of the mature encoded protein is regulated by heme: high levels of heme down-regulate the mature enzyme in mitochondria while low heme levels up-regulate. A pseudogene of this gene is located on chromosome 12. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000023330 | 5’-aminolevulinate synthase 1 | NA |
| C7 | 730 | C7 is a component of the complement system. It participates in the formation of Membrane Attack Complex (MAC). People with C7 deficiency are prone to bacterial infection. | ENSG00000112936 | complement component 7 | NA |
| NA | NA | NA | ENSG00000259716 | NA | TRUE |
| KRT19 | 3880 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | ENSG00000171345 | keratin 19 | NA |
| STIP1 | 10963 | STIP1 is an adaptor protein that coordinates the functions of HSP70 (see HSPA1A; MIM 140550) and HSP90 (see HSP90AA1; MIM 140571) in protein folding. It is thought to assist in the transfer of proteins from HSP70 to HSP90 by binding both HSP90 and substrate-bound HSP70. STIP1 also stimulates the ATPase activity of HSP70 and inhibits the ATPase activity of HSP90, suggesting that it regulates both the conformations and ATPase cycles of these chaperones (Song and Masison, 2005 [PubMed 16100115]). | ENSG00000168439 | stress induced phosphoprotein 1 | NA |
| THBS1 | 7057 | The protein encoded by this gene is a subunit of a disulfide-linked homotrimeric protein. This protein is an adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions. This protein can bind to fibrinogen, fibronectin, laminin, type V collagen and integrins alpha-V/beta-1. This protein has been shown to play roles in platelet aggregation, angiogenesis, and tumorigenesis. | ENSG00000137801 | thrombospondin 1 | NA |
| GAPDH | 2597 | This gene encodes a member of the glyceraldehyde-3-phosphate dehydrogenase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. The product of this gene catalyzes an important energy-yielding step in carbohydrate metabolism, the reversible oxidative phosphorylation of glyceraldehyde-3-phosphate in the presence of inorganic phosphate and nicotinamide adenine dinucleotide (NAD). The encoded protein has additionally been identified to have uracil DNA glycosylase activity in the nucleus. Also, this protein contains a peptide that has antimicrobial activity against E. coli, P. aeruginosa, and C. albicans. Studies of a similar protein in mouse have assigned a variety of additional functions including nitrosylation of nuclear proteins, the regulation of mRNA stability, and acting as a transferrin receptor on the cell surface of macrophage. Many pseudogenes similar to this locus are present in the human genome. Alternative splicing results in multiple transcript variants. | ENSG00000111640 | glyceraldehyde-3-phosphate dehydrogenase | NA |
| PGA3 | 643834 | This gene encodes a protein precursor of the digestive enzyme pepsin, a member of the peptidase A1 family of endopeptidases. The encoded precursor is secreted by gastric chief cells and undergoes autocatalytic cleavage in acidic conditions to form the active enzyme, which functions in the digestion of dietary proteins. This gene is found in a cluster of related genes on chromosome 11, each of which encodes one of multiple pepsinogens. Pepsinogen levels in serum may serve as a biomarker for atrophic gastritis and gastric cancer. | ENSG00000229859 | pepsinogen 3, group I (pepsinogen A) | NA |
| MYH11 | 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000133392 | myosin, heavy chain 11, smooth muscle | NA |
| HSPA9 | 3313 | This gene encodes a member of the heat shock protein 70 gene family. The encoded protein is primarily localized to the mitochondria but is also found in the endoplasmic reticulum, plasma membrane and cytoplasmic vesicles. This protein is a heat-shock cognate protein. This protein plays a role in cell proliferation, stress response and maintenance of the mitochondria. A pseudogene of this gene is found on chromosome 2. | ENSG00000113013 | heat shock protein family A (Hsp70) member 9 | NA |
| AIF1L | 83543 | NA | ENSG00000126878 | allograft inflammatory factor 1 like | NA |
| GFAP | 2670 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | ENSG00000131095 | glial fibrillary acidic protein | NA |
| EMILIN1 | 11117 | This gene encodes an extracellular matrix glycoprotein that is characterized by an N-terminal microfibril interface domain, a coiled-coiled alpha-helical domain, a collagenous domain and a C-terminal globular C1q domain. The encoded protein associates with elastic fibers at the interface between elastin and microfibrils and may play a role in the development of elastic tissues including large blood vessels, dermis, heart and lung. | ENSG00000138080 | elastin microfibril interfacer 1 | NA |
| PPP1R18 | 170954 | Protein phosphatase-1 (PP1; see MIM 176875) interacts with regulatory subunits that target the enzyme to different cellular locations and change its activity toward specific substrates. Phostensin is a regulatory subunit that targets PP1 to F-actin (see MIM 102610) cytoskeleton (Kao et al., 2007 [PubMed 17374523]). | ENSG00000146112 | protein phosphatase 1 regulatory subunit 18 | NA |
| HSP90AB1 | 3326 | This gene encodes a member of the heat shock protein 90 family; these proteins are involved in signal transduction, protein folding and degradation and morphological evolution. This gene encodes the constitutive form of the cytosolic 90 kDa heat-shock protein and is thought to play a role in gastric apoptosis and inflammation. Alternative splicing results in multiple transcript variants. Pseudogenes have been identified on multiple chromosomes. | ENSG00000096384 | heat shock protein 90kDa alpha family class B member 1 | NA |
| ACTB | 60 | This gene encodes one of six different actin proteins. Actins are highly conserved proteins that are involved in cell motility, structure, and integrity. This actin is a major constituent of the contractile apparatus and one of the two nonmuscle cytoskeletal actins. | ENSG00000075624 | actin, beta | NA |
| EIF4G1 | 1981 | The protein encoded by this gene is a component of the multi-subunit protein complex EIF4F. This complex facilitates the recruitment of mRNA to the ribosome, which is a rate-limiting step during the initiation phase of protein synthesis. The recognition of the mRNA cap and the ATP-dependent unwinding of 5’-terminal secondary structure is catalyzed by factors in this complex. The subunit encoded by this gene is a large scaffolding protein that contains binding sites for other members of the EIF4F complex. A domain at its N-terminus can also interact with the poly(A)-binding protein, which may mediate the circularization of mRNA during translation. Alternative splicing results in multiple transcript variants, some of which are derived from alternative promoter usage. | ENSG00000114867 | eukaryotic translation initiation factor 4 gamma 1 | NA |
| C10orf10 | 11067 | The expression of this gene is induced by fasting as well as by progesterone. The protein encoded by this gene contains a t-synaptosome-associated protein receptor (SNARE) coiled-coil homology domain and a peroxisomal targeting signal. Production of the encoded protein leads to phosphorylation and activation of the transcription factor ELK1. | ENSG00000165507 | chromosome 10 open reading frame 10 | NA |
| ST14 | 6768 | The protein encoded by this gene is an epithelial-derived, integral membrane serine protease. This protease forms a complex with the Kunitz-type serine protease inhibitor, HAI-1, and is found to be activated by sphingosine 1-phosphate. This protease has been shown to cleave and activate hepatocyte growth factor/scattering factor, and urokinase plasminogen activator, which suggest the function of this protease as an epithelial membrane activator for other proteases and latent growth factors. The expression of this protease has been associated with breast, colon, prostate, and ovarian tumors, which implicates its role in cancer invasion, and metastasis. | ENSG00000149418 | suppression of tumorigenicity 14 | NA |
| TPP1 | 1200 | This gene encodes a member of the sedolisin family of serine proteases. The protease functions in the lysosome to cleave N-terminal tripeptides from substrates, and has weaker endopeptidase activity. It is synthesized as a catalytically-inactive enzyme which is activated and auto-proteolyzed upon acidification. Mutations in this gene result in late-infantile neuronal ceroid lipofuscinosis, which is associated with the failure to degrade specific neuropeptides and a subunit of ATP synthase in the lysosome. | ENSG00000166340 | tripeptidyl peptidase 1 | NA |
| HBA1 | 3039 | The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | ENSG00000206172 | hemoglobin subunit alpha 1 | NA |
| RAB11FIP4 | 84440 | Proteins of the large Rab GTPase family (see RAB1A; MIM 179508) have regulatory roles in the formation, targeting, and fusion of intracellular transport vesicles. RAB11FIP4 is one of many proteins that interact with and regulate Rab GTPases (Hales et al., 2001 [PubMed 11495908]). | ENSG00000131242 | RAB11 family interacting protein 4 | NA |
| KRT2 | 3849 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000172867 | keratin 2 | NA |
| SLC2A3 | 6515 | NA | ENSG00000059804 | solute carrier family 2 member 3 | NA |
| PTGFRN | 5738 | NA | ENSG00000134247 | prostaglandin F2 receptor inhibitor | NA |
| DLG5 | 9231 | This gene encodes a member of the family of discs large (DLG) homologs, a subset of the membrane-associated guanylate kinase (MAGUK) superfamily. The MAGUK proteins are composed of a catalytically inactive guanylate kinase domain, in addition to PDZ and SH3 domains, and are thought to function as scaffolding molecules at sites of cell-cell contact. The protein encoded by this gene localizes to the plasma membrane and cytoplasm, and interacts with components of adherens junctions and the cytoskeleton. It is proposed to function in the transmission of extracellular signals to the cytoskeleton and in the maintenance of epithelial cell structure. Alternative splice variants have been described but their biological nature has not been determined. | ENSG00000151208 | discs large MAGUK scaffold protein 5 | NA |
| DNAJA1 | 3301 | This gene encodes a member of the DnaJ family of proteins, which act as heat shock protein 70 cochaperones. Heat shock proteins facilitate protein folding, trafficking, prevention of aggregation, and proteolytic degradation. Members of this family are characterized by a highly conserved N-terminal J domain, a glycine/phenylalanine-rich region, four CxxCxGxG zinc finger repeats, and a C-terminal substrate-binding domain. The J domain mediates the interaction with heat shock protein 70 to recruit substrates and regulate ATP hydrolysis activity. In humans, this gene has been implicated in positive regulation of virus replication through co-option by the influenza A virus. Several pseudogenes of this gene are found on other chromosomes. | ENSG00000086061 | DnaJ heat shock protein family (Hsp40) member A1 | NA |
| CXCL14 | 9547 | This antimicrobial gene belongs to the cytokine gene family which encode secreted proteins involved in immunoregulatory and inflammatory processes. The protein encoded by this gene is structurally related to the CXC (Cys-X-Cys) subfamily of cytokines. Members of this subfamily are characterized by two cysteines separated by a single amino acid. This cytokine displays chemotactic activity for monocytes but not for lymphocytes, dendritic cells, neutrophils or macrophages. It has been implicated that this cytokine is involved in the homeostasis of monocyte-derived macrophages rather than in inflammation. | ENSG00000145824 | C-X-C motif chemokine ligand 14 | NA |
| C4B | 721 | This gene encodes the basic form of complement factor 4, part of the classical activation pathway. The protein is expressed as a single chain precursor which is proteolytically cleaved into a trimer of alpha, beta, and gamma chains prior to secretion. The trimer provides a surface for interaction between the antigen-antibody complex and other complement components. The alpha chain may be cleaved to release C4 anaphylatoxin, a mediator of local inflammation. Deficiency of this protein is associated with systemic lupus erythematosus. This gene localizes to the major histocompatibility complex (MHC) class III region on chromosome 6. Varying haplotypes of this gene cluster exist, such that individuals may have 1, 2, or 3 copies of this gene. In addition, this gene exists as a long form and a short form due to the presence or absence of a 6.4 kb endogenous HERV-K retrovirus in intron 9. | ENSG00000224389 | complement component 4B (Chido blood group) | NA |
| HSD11B2 | 3291 | There are at least two isozymes of the corticosteroid 11-beta-dehydrogenase, a microsomal enzyme complex responsible for the interconversion of cortisol and cortisone. The type I isozyme has both 11-beta-dehydrogenase (cortisol to cortisone) and 11-oxoreductase (cortisone to cortisol) activities. The type II isozyme, encoded by this gene, has only 11-beta-dehydrogenase activity. In aldosterone-selective epithelial tissues such as the kidney, the type II isozyme catalyzes the glucocorticoid cortisol to the inactive metabolite cortisone, thus preventing illicit activation of the mineralocorticoid receptor. In tissues that do not express the mineralocorticoid receptor, such as the placenta and testis, it protects cells from the growth-inhibiting and/or pro-apoptotic effects of cortisol, particularly during embryonic development. Mutations in this gene cause the syndrome of apparent mineralocorticoid excess and hypertension. | ENSG00000176387 | hydroxysteroid 11-beta dehydrogenase 2 | NA |
| COL1A2 | 1278 | This gene encodes the pro-alpha2 chain of type I collagen whose triple helix comprises two alpha1 chains and one alpha2 chain. Type I is a fibril-forming collagen found in most connective tissues and is abundant in bone, cornea, dermis and tendon. Mutations in this gene are associated with osteogenesis imperfecta types I-IV, Ehlers-Danlos syndrome type VIIB, recessive Ehlers-Danlos syndrome Classical type, idiopathic osteoporosis, and atypical Marfan syndrome. Symptoms associated with mutations in this gene, however, tend to be less severe than mutations in the gene for the alpha1 chain of type I collagen (COL1A1) reflecting the different role of alpha2 chains in matrix integrity. Three transcripts, resulting from the use of alternate polyadenylation signals, have been identified for this gene. | ENSG00000164692 | collagen type I alpha 2 chain | NA |
| FN1 | 2335 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | ENSG00000115414 | fibronectin 1 | NA |
| PPP1R1B | 84152 | This gene encodes a bifunctional signal transduction molecule. Dopaminergic and glutamatergic receptor stimulation regulates its phosphorylation and function as a kinase or phosphatase inhibitor. As a target for dopamine, this gene may serve as a therapeutic target for neurologic and psychiatric disorders. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000131771 | protein phosphatase 1 regulatory inhibitor subunit 1B | NA |
| HIF1A | 3091 | This gene encodes the alpha subunit of transcription factor hypoxia-inducible factor-1 (HIF-1), which is a heterodimer composed of an alpha and a beta subunit. HIF-1 functions as a master regulator of cellular and systemic homeostatic response to hypoxia by activating transcription of many genes, including those involved in energy metabolism, angiogenesis, apoptosis, and other genes whose protein products increase oxygen delivery or facilitate metabolic adaptation to hypoxia. HIF-1 thus plays an essential role in embryonic vascularization, tumor angiogenesis and pathophysiology of ischemic disease. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | ENSG00000100644 | hypoxia inducible factor 1 alpha subunit | NA |
| HIP1R | 9026 | NA | ENSG00000130787 | huntingtin interacting protein 1 related | NA |
| CTNNAL1 | 8727 | NA | ENSG00000119326 | catenin alpha like 1 | NA |
| FBLN1 | 2192 | Fibulin 1 is a secreted glycoprotein that becomes incorporated into a fibrillar extracellular matrix. Calcium-binding is apparently required to mediate its binding to laminin and nidogen. It mediates platelet adhesion via binding fibrinogen. Four splice variants which differ in the 3’ end have been identified. Each variant encodes a different isoform, but no functional distinctions have been identified among the four variants. | ENSG00000077942 | fibulin 1 | NA |
| DCN | 1634 | This gene encodes a member of the small leucine-rich proteoglycan family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature protein. This protein plays a role in collagen fibril assembly. Binding of this protein to multiple cell surface receptors mediates its role in tumor suppression, including a stimulatory effect on autophagy and inflammation and an inhibitory effect on angiogenesis and tumorigenesis. This gene and the related gene biglycan are thought to be the result of a gene duplication. Mutations in this gene are associated with congenital stromal corneal dystrophy in human patients. | ENSG00000011465 | decorin | NA |
| SOX9 | 6662 | The protein encoded by this gene recognizes the sequence CCTTGAG along with other members of the HMG-box class DNA-binding proteins. It acts during chondrocyte differentiation and, with steroidogenic factor 1, regulates transcription of the anti-Muellerian hormone (AMH) gene. Deficiencies lead to the skeletal malformation syndrome campomelic dysplasia, frequently with sex reversal. | ENSG00000125398 | SRY-box 9 | NA |
| LDLR | 3949 | The low density lipoprotein receptor (LDLR) gene family consists of cell surface proteins involved in receptor-mediated endocytosis of specific ligands. Low density lipoprotein (LDL) is normally bound at the cell membrane and taken into the cell ending up in lysosomes where the protein is degraded and the cholesterol is made available for repression of microsomal enzyme 3-hydroxy-3-methylglutaryl coenzyme A (HMG CoA) reductase, the rate-limiting step in cholesterol synthesis. At the same time, a reciprocal stimulation of cholesterol ester synthesis takes place. Mutations in this gene cause the autosomal dominant disorder, familial hypercholesterolemia. Alternate splicing results in multiple transcript variants. | ENSG00000130164 | low density lipoprotein receptor | NA |
| ADIRF | 10974 | APM2 gene is exclusively expressed in adipose tissue. Its function is currently unknown. | ENSG00000148671 | adipogenesis regulatory factor | NA |
| ZFP36 | 7538 | NA | ENSG00000128016 | ZFP36 ring finger protein | NA |
| C1S | 716 | This gene encodes a serine protease, which is a major constituent of the human complement subcomponent C1. C1s associates with two other complement components C1r and C1q in order to yield the first component of the serum complement system. Defects in this gene are the cause of selective C1s deficiency. | ENSG00000182326 | complement component 1, s subcomponent | NA |
| KANK2 | 25959 | NA | ENSG00000197256 | KN motif and ankyrin repeat domains 2 | NA |
| GP2 | 2813 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | ENSG00000169347 | glycoprotein 2 | NA |
| GADD45B | 4616 | This gene is a member of a group of genes whose transcript levels are increased following stressful growth arrest conditions and treatment with DNA-damaging agents. The genes in this group respond to environmental stresses by mediating activation of the p38/JNK pathway. This activation is mediated via their proteins binding and activating MTK1/MEKK4 kinase, which is an upstream activator of both p38 and JNK MAPKs. The function of these genes or their protein products is involved in the regulation of growth and apoptosis. These genes are regulated by different mechanisms, but they are often coordinately expressed and can function cooperatively in inhibiting cell growth. | ENSG00000099860 | growth arrest and DNA damage inducible beta | NA |
| STMN1 | 3925 | This gene belongs to the stathmin family of genes. It encodes a ubiquitous cytosolic phosphoprotein proposed to function as an intracellular relay integrating regulatory signals of the cellular environment. The encoded protein is involved in the regulation of the microtubule filament system by destabilizing microtubules. It prevents assembly and promotes disassembly of microtubules. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000117632 | stathmin 1 | NA |
| TPM1 | 7168 | This gene is a member of the tropomyosin family of highly conserved, widely distributed actin-binding proteins involved in the contractile system of striated and smooth muscles and the cytoskeleton of non-muscle cells. Tropomyosin is composed of two alpha-helical chains arranged as a coiled-coil. It is polymerized end to end along the two grooves of actin filaments and provides stability to the filaments. The encoded protein is one type of alpha helical chain that forms the predominant tropomyosin of striated muscle, where it also functions in association with the troponin complex to regulate the calcium-dependent interaction of actin and myosin during muscle contraction. In smooth muscle and non-muscle cells, alternatively spliced transcript variants encoding a range of isoforms have been described. Mutations in this gene are associated with type 3 familial hypertrophic cardiomyopathy. | ENSG00000140416 | tropomyosin 1 (alpha) | NA |
| YARS | 8565 | Aminoacyl-tRNA synthetases catalyze the aminoacylation of tRNA by their cognate amino acid. Because of their central role in linking amino acids with nucleotide triplets contained in tRNAs, aminoacyl-tRNA synthetases are thought to be among the first proteins that appeared in evolution. Tyrosyl-tRNA synthetase belongs to the class I tRNA synthetase family. Cytokine activities have also been observed for the human tyrosyl-tRNA synthetase, after it is split into two parts, an N-terminal fragment that harbors the catalytic site and a C-terminal fragment found only in the mammalian enzyme. The N-terminal fragment is an interleukin-8-like cytokine, whereas the released C-terminal fragment is an EMAP II-like cytokine. | ENSG00000134684 | tyrosyl-tRNA synthetase | NA |
| IGF2R | 3482 | This gene encodes a receptor for both insulin-like growth factor 2 and mannose 6-phosphate. The binding sites for each ligand are located on different segments of the protein. This receptor has various functions, including in the intracellular trafficking of lysosomal enzymes, the activation of transforming growth factor beta, and the degradation of insulin-like growth factor 2. Mutation or loss of heterozygosity of this gene has been association with risk of hepatocellular carcinoma. The orthologous mouse gene is imprinted and shows exclusive expression from the maternal allele; however, imprinting of the human gene may be polymorphic, as only a minority of individuals showed biased expression from the maternal allele (PMID:8267611). | ENSG00000197081 | insulin like growth factor 2 receptor | NA |
| TINAGL1 | 64129 | The protein encoded by this gene is similar in sequence to tubulointerstitial nephritis antigen, a secreted glycoprotein that is recognized by antibodies in some types of immune-related tubulointerstitial nephritis. Three transcript variants encoding different isoforms have been found for this gene. | ENSG00000142910 | tubulointerstitial nephritis antigen like 1 | NA |
| FDX1 | 2230 | This gene encodes a small iron-sulfur protein that transfers electrons from NADPH through ferredoxin reductase to mitochondrial cytochrome P450, involved in steroid, vitamin D, and bile acid metabolism. Pseudogenes of this functional gene are found on chromosomes 20 and 21. | ENSG00000137714 | ferredoxin 1 | NA |
| GKN1 | 56287 | The protein encoded by this gene is found to be down-regulated in human gastric cancer tissue as compared to normal gastric mucosa. | ENSG00000169605 | gastrokine 1 | NA |
| NFIL3 | 4783 | The protein encoded by this gene is a transcriptional regulator that binds as a homodimer to activating transcription factor (ATF) sites in many cellular and viral promoters. The encoded protein represses PER1 and PER2 expression and therefore plays a role in the regulation of circadian rhythm. Three transcript variants encoding the same protein have been found for this gene. | ENSG00000165030 | nuclear factor, interleukin 3 regulated | NA |
| DDR2 | 4921 | Receptor tyrosine kinases (RTKs) play a key role in the communication of cells with their microenvironment. These molecules are involved in the regulation of cell growth, differentiation, and metabolism. In several cases the biochemical mechanism by which RTKs transduce signals across the membrane has been shown to be ligand induced receptor oligomerization and subsequent intracellular phosphorylation. This autophosphorylation leads to phosphorylation of cytosolic targets as well as association with other molecules, which are involved in pleiotropic effects of signal transduction. RTKs have a tripartite structure with extracellular, transmembrane, and cytoplasmic regions. This gene encodes a member of a novel subclass of RTKs and contains a distinct extracellular region encompassing a factor VIII-like domain. Alternative splicing in the 5’ UTR results in multiple transcript variants encoding the same protein. | ENSG00000162733 | discoidin domain receptor tyrosine kinase 2 | NA |
| CXCL8 | 3576 | The protein encoded by this gene is a member of the CXC chemokine family. This chemokine is one of the major mediators of the inflammatory response. This chemokine is secreted by several cell types. It functions as a chemoattractant, and is also a potent angiogenic factor. This gene is believed to play a role in the pathogenesis of bronchiolitis, a common respiratory tract disease caused by viral infection. This gene and other ten members of the CXC chemokine gene family form a chemokine gene cluster in a region mapped to chromosome 4q. | ENSG00000169429 | C-X-C motif chemokine ligand 8 | NA |
| TXLNA | 200081 | NA | ENSG00000084652 | taxilin alpha | NA |
| C4A | 720 | This gene encodes the acidic form of complement factor 4, part of the classical activation pathway. The protein is expressed as a single chain precursor which is proteolytically cleaved into a trimer of alpha, beta, and gamma chains prior to secretion. The trimer provides a surface for interaction between the antigen-antibody complex and other complement components. The alpha chain is cleaved to release C4 anaphylatoxin, an antimicrobial peptide and a mediator of local inflammation. Deficiency of this protein is associated with systemic lupus erythematosus and type I diabetes mellitus. This gene localizes to the major histocompatibility complex (MHC) class III region on chromosome 6. Varying haplotypes of this gene cluster exist, such that individuals may have 1, 2, or 3 copies of this gene. Two transcript variants encoding different isoforms have been found for this gene. | ENSG00000244731 | complement component 4A (Rodgers blood group) | NA |
| SPARCL1 | 8404 | NA | ENSG00000152583 | SPARC like 1 | NA |
| SPINT1 | 6692 | The protein encoded by this gene is a member of the Kunitz family of serine protease inhibitors. The protein is a potent inhibitor specific for HGF activator and is thought to be involved in the regulation of the proteolytic activation of HGF in injured tissues. Alternative splicing results in multiple variants encoding different isoforms. | ENSG00000166145 | serine peptidase inhibitor, Kunitz type 1 | NA |
| SBNO2 | 22904 | NA | ENSG00000064932 | strawberry notch homolog 2 (Drosophila) | NA |
| CDKN1A | 1026 | This gene encodes a potent cyclin-dependent kinase inhibitor. The encoded protein binds to and inhibits the activity of cyclin-cyclin-dependent kinase2 or -cyclin-dependent kinase4 complexes, and thus functions as a regulator of cell cycle progression at G1. The expression of this gene is tightly controlled by the tumor suppressor protein p53, through which this protein mediates the p53-dependent cell cycle G1 phase arrest in response to a variety of stress stimuli. This protein can interact with proliferating cell nuclear antigen, a DNA polymerase accessory factor, and plays a regulatory role in S phase DNA replication and DNA damage repair. This protein was reported to be specifically cleaved by CASP3-like caspases, which thus leads to a dramatic activation of cyclin-dependent kinase2, and may be instrumental in the execution of apoptosis following caspase activation. Mice that lack this gene have the ability to regenerate damaged or missing tissue. Multiple alternatively spliced variants have been found for this gene. | ENSG00000124762 | cyclin-dependent kinase inhibitor 1A | NA |
| IL4R | 3566 | This gene encodes the alpha chain of the interleukin-4 receptor, a type I transmembrane protein that can bind interleukin 4 and interleukin 13 to regulate IgE production. The encoded protein also can bind interleukin 4 to promote differentiation of Th2 cells. A soluble form of the encoded protein can be produced by proteolysis of the membrane-bound protein, and this soluble form can inhibit IL4-mediated cell proliferation and IL5 upregulation by T-cells. Allelic variations in this gene have been associated with atopy, a condition that can manifest itself as allergic rhinitis, sinusitus, asthma, or eczema. Polymorphisms in this gene are also associated with resistance to human immunodeficiency virus type-1 infection. Alternate splicing results in multiple transcript variants. | ENSG00000077238 | interleukin 4 receptor | NA |
| URB1 | 9875 | NA | ENSG00000142207 | URB1 ribosome biogenesis 1 homolog (S. cerevisiae) | NA |
| REG1B | 5968 | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | ENSG00000172023 | regenerating family member 1 beta | NA |
| TFCP2L1 | 29842 | NA | ENSG00000115112 | transcription factor CP2-like 1 | NA |
| CRACR2B | 283229 | NA | ENSG00000177685 | calcium release activated channel regulator 2B | NA |
| ARMC9 | 80210 | NA | ENSG00000135931 | armadillo repeat containing 9 | NA |
| ANXA1 | 301 | This gene encodes a membrane-localized protein that binds phospholipids. This protein inhibits phospholipase A2 and has anti-inflammatory activity. Loss of function or expression of this gene has been detected in multiple tumors. | ENSG00000135046 | annexin A1 | NA |
| KRT1 | 3848 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | ENSG00000167768 | keratin 1 | NA |
| PRSS1 | 5644 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | ENSG00000204983 | protease, serine 1 | NA |
| LAMB2 | 3913 | Laminins, a family of extracellular matrix glycoproteins, are the major noncollagenous constituent of basement membranes. They have been implicated in a wide variety of biological processes including cell adhesion, differentiation, migration, signaling, neurite outgrowth and metastasis. Laminins, composed of 3 non identical chains: laminin alpha, beta and gamma (formerly A, B1, and B2, respectively), form a cruciform structure consisting of 3 short arms, each formed by a different chain, and a long arm composed of all 3 chains. Each laminin chain is a multidomain protein encoded by a distinct gene. Several isoforms of each chain have been described. Different alpha, beta and gamma chain isomers combine to give rise to different heterotrimeric laminin isoforms which are designated by Arabic numerals in the order of their discovery, i.e. alpha1beta1gamma1 heterotrimer is laminin 1. The biological functions of the different chains and trimer molecules are largely unknown, but some of the chains have been shown to differ with respect to their tissue distribution, presumably reflecting diverse functions in vivo. This gene encodes the beta chain isoform laminin, beta 2. The beta 2 chain contains the 7 structural domains typical of beta chains of laminin, including the short alpha region. However, unlike beta 1 chain, beta 2 has a more restricted tissue distribution. It is enriched in the basement membrane of muscles at the neuromuscular junctions, kidney glomerulus and vascular smooth muscle. Transgenic mice in which the beta 2 chain gene was inactivated by homologous recombination, showed defects in the maturation of neuromuscular junctions and impairment of glomerular filtration. Alternative splicing involving a non consensus 5’ splice site (gc) in the 5’ UTR of this gene has been reported. It was suggested that inefficient splicing of this first intron, which does not change the protein sequence, results in a greater abundance of the unspliced form of the transcript than the spliced form. The full-length nature of the spliced transcript is not known. | ENSG00000172037 | laminin subunit beta 2 | NA |
| MARCKSL1 | 65108 | This gene encodes a member of the myristoylated alanine-rich C-kinase substrate (MARCKS) family. Members of this family play a role in cytoskeletal regulation, protein kinase C signaling and calmodulin signaling. The encoded protein affects the formation of adherens junction. Alternative splicing results in multiple transcript variants. Pseudogenes of this gene are located on the long arm of chromosomes 6 and 10. | ENSG00000175130 | MARCKS like 1 | NA |
| CPA1 | 1357 | This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | ENSG00000091704 | carboxypeptidase A1 | NA |
| LGALS4 | 3960 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | ENSG00000171747 | galectin 4 | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_sqrt/gene_names_clus_",20,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
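The export above writes every queried ID, including any that mygene could not resolve (rows whose `notfound` column is `TRUE`). If only resolved genes are wanted in the cluster file, one option is to filter on that column first. The following is a minimal sketch on a toy data frame: `out_toy`, its placeholder IDs and symbols, and the temporary output path are all hypothetical and are not results from this analysis. It also assumes a `notfound` column is present, as it is in the tables shown here.

```r
# Toy stand-in for the data frame returned by mygene::queryMany();
# the IDs and symbols are illustrative placeholders, not real results.
out_toy <- data.frame(
  query    = c("ENSG_A", "ENSG_B", "ENSG_C"),
  symbol   = c("GENE1", NA, "GENE3"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Keep a row unless notfound is explicitly TRUE (NA means the ID was matched).
keep <- !(out_toy$notfound %in% TRUE)
resolved_ids <- out_toy$query[keep]

# Write one ID per line, mirroring the write.table() call above;
# a temporary file is used so the sketch does not touch the real outputs.
out_file <- tempfile(fileext = ".txt")
write.table(resolved_ids, out_file, col.names = FALSE,
            row.names = FALSE, quote = FALSE)
```

Using `%in% TRUE` rather than `== TRUE` keeps the `NA` entries from propagating into the logical index.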
lambda_out <- read.table("../sfa_outputs/GTEX2013_transpose/voom_gtex/gtex_voom_transpose_lambda.out");
f_out <- read.table("../sfa_outputs/GTEX2013_transpose/voom_gtex/gtex_voom_transpose_F.out");
gene_names <- as.vector(as.matrix(read.table("../sfa_inputs/gene_names_GTEX_V6.txt")));
gene_names <- substring(gene_names,1,15);
xli <- gene_names;
indices_mat <- SFA.ExtractTopFeatures(lambda_out, top_features = 100, options="min", mult.annotate = TRUE)
gene_list <- do.call(rbind, lapply(1:dim(indices_mat)[1], function(x) gene_names[indices_mat[x,]]))
out <- mygene::queryMany(gene_list[1,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | summary | X_id | query | symbol | notfound |
|---|---|---|---|---|---|
| ankyrin repeat domain 1 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. | 27063 | ENSG00000148677 | ANKRD1 | NA |
| ankyrin repeat domain 2 | This gene encodes a protein that belongs to the muscle ankyrin repeat protein (MARP) family. A similar gene in rodents is a component of a muscle stress response pathway and plays a role in the stretch-response associated with slow muscle function. Alternative splicing results in multiple transcript variants encoding different isoforms. | 26287 | ENSG00000165887 | ANKRD2 | NA |
| actin, alpha 1, skeletal muscle | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. | 58 | ENSG00000143632 | ACTA1 | NA |
| troponin C1, slow skeletal and cardiac type | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | 7134 | ENSG00000114854 | TNNC1 | NA |
| NA | NA | ENSG00000215861 | ENSG00000215861 | WI2-1896O14.1 | NA |
| myoglobin | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. | 4151 | ENSG00000198125 | MB | NA |
| myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | 4625 | ENSG00000092054 | MYH7 | NA |
| CD22 molecule | NA | 933 | ENSG00000012124 | CD22 | NA |
| NA | NA | ENSG00000258444 | ENSG00000258444 | CTD-2201G16.1 | NA |
| dedicator of cytokinesis 8 | This gene encodes a member of the DOCK180 family of guanine nucleotide exchange factors. Guanine nucleotide exchange factors interact with Rho GTPases and are components of intracellular signaling networks. Mutations in this gene result in the autosomal recessive form of the hyper-IgE syndrome. Alternatively spliced transcript variants encoding different isoforms have been described. | 81704 | ENSG00000107099 | DOCK8 | NA |
| myosin light chain 2 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | 4633 | ENSG00000111245 | MYL2 | NA |
| cytochrome c oxidase subunit 6A2 | Cytochrome c oxidase (COX), the terminal enzyme of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. It is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may be involved in the regulation and assembly of the complex. This nuclear gene encodes polypeptide 2 (heart/muscle isoform) of subunit VIa, and polypeptide 2 is present only in striated muscles. Polypeptide 1 (liver isoform) of subunit VIa is encoded by a different gene, and is found in all non-muscle tissues. These two polypeptides share 66% amino acid sequence identity. | 1339 | ENSG00000156885 | COX6A2 | NA |
| cysteine and glycine rich protein 3 | This gene encodes a member of the CSRP family of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this protein is found in a group of proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Mutations in this gene are thought to cause heritable forms of hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM) in humans. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. | 8048 | ENSG00000129170 | CSRP3 | NA |
| ATP binding cassette subfamily B member 1 | The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MDR/TAP subfamily. Members of the MDR/TAP subfamily are involved in multidrug resistance. The protein encoded by this gene is an ATP-dependent drug efflux pump for xenobiotic compounds with broad substrate specificity. It is responsible for decreased drug accumulation in multidrug-resistant cells and often mediates the development of resistance to anticancer drugs. This protein also functions as a transporter in the blood-brain barrier. | 5243 | ENSG00000085563 | ABCB1 | NA |
| creatine kinase, M-type | The protein encoded by this gene is a cytoplasmic enzyme involved in energy homeostasis and is an important serum marker for myocardial infarction. The encoded protein reversibly catalyzes the transfer of phosphate between ATP and various phosphogens such as creatine phosphate. It acts as a homodimer in striated muscle as well as in other tissues, and as a heterodimer with a similar brain isozyme in heart. The encoded protein is a member of the ATP:guanido phosphotransferase protein family. | 1158 | ENSG00000104879 | CKM | NA |
| cytokine receptor like factor 1 | This gene encodes a member of the cytokine type I receptor family. The protein forms a secreted complex with cardiotrophin-like cytokine factor 1 and acts on cells expressing ciliary neurotrophic factor receptors. The complex can promote survival of neuronal cells. Mutations in this gene result in Crisponi syndrome and cold-induced sweating syndrome. | 9244 | ENSG00000006016 | CRLF1 | NA |
| actin, alpha, cardiac muscle 1 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | 70 | ENSG00000159251 | ACTC1 | NA |
| titin-cap | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | 8557 | ENSG00000173991 | TCAP | NA |
| Kazal type serine peptidase inhibitor domain 1 | This gene encodes a secreted member of the insulin growth factor-binding protein (IGFBP) superfamily. The protein contains an insulin growth factor-binding domain in its N-terminal region, a Kazal-type serine protease inhibitor and follistatin-like domain in its central region, and an immunoglobulin-like domain in its C-terminal region. Studies of the mouse ortholog suggest that this protein may function in bone development and bone regeneration. This gene is hypomethylated and over-expressed in high-grade glioma compared to low-grade glioma, and thus the hypomethylated gene may be associated with cell proliferation and the shorter survival of patients with high-grade glioma. It is also one of numerous genes found to be deleted in a novel 5.54 Mb interstitial deletion, which is associated with multiple congenital anomalies. Alternative splicing results in multiple transcript variants. | 81621 | ENSG00000107821 | KAZALD1 | NA |
| ADP-ribosylhydrolase like 1 | ADP-ribosylation is a reversible posttranslational modification used to regulate protein function. ADP-ribosyltransferases (see ART1; MIM 601625) transfer ADP-ribose from NAD+ to the target protein, and ADP-ribosylhydrolases, such as ADPRHL1, reverse the reaction (Glowacki et al., 2002 [PubMed 12070318]). | 113622 | ENSG00000153531 | ADPRHL1 | NA |
| G protein-coupled receptor 183 | This gene was identified by the up-regulation of its expression upon Epstein-Barr virus infection of primary B lymphocytes. This gene is predicted to encode a G protein-coupled receptor that is most closely related to the thrombin receptor. Expression of this gene was detected in B-lymphocyte cell lines and lymphoid tissues but not in T-lymphocyte cell lines or peripheral blood T lymphocytes. The function of this gene is unknown. | 1880 | ENSG00000169508 | GPR183 | NA |
| whirlin | This gene is thought to function in the organization and stabilization of stereocilia elongation and actin cytoskeletal assembly, based on studies of the related mouse gene. Mutations in this gene have been associated with autosomal recessive non-syndromic deafness and Usher Syndrome. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | 25861 | ENSG00000095397 | WHRN | NA |
| integrin subunit beta like 1 | This gene encodes a beta integrin-related protein that is a member of the EGF-like protein family. The encoded protein contains integrin-like cysteine-rich repeats. Alternative splicing results in multiple transcript variants. | 9358 | ENSG00000198542 | ITGBL1 | NA |
| NA | NA | NA | ENSG00000180672 | NA | TRUE |
| bone marrow stromal cell antigen 2 | Bone marrow stromal cells are involved in the growth and development of B-cells. The specific function of the protein encoded by the bone marrow stromal cell antigen 2 is undetermined; however, this protein may play a role in pre-B-cell growth and in rheumatoid arthritis. | 684 | ENSG00000130303 | BST2 | NA |
| myozenin 2 | The protein encoded by this gene belongs to a family of sarcomeric proteins that bind to calcineurin, a phosphatase involved in calcium-dependent signal transduction in diverse cell types. These family members tether calcineurin to alpha-actinin at the z-line of the sarcomere of cardiac and skeletal muscle cells, and thus they are important for calcineurin signaling. Mutations in this gene cause cardiomyopathy familial hypertrophic type 16, a hereditary heart disorder. | 51778 | ENSG00000172399 | MYOZ2 | NA |
| protein tyrosine phosphatase, non-receptor type 3 | The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This protein contains a C-terminal PTP domain and an N-terminal domain homologous to the band 4.1 superfamily of cytoskeletal-associated proteins. P97, a cell cycle regulator involved in a variety of membrane related functions, has been shown to be a substrate of this PTP. This PTP was also found to interact with, and be regulated by adaptor protein 14-3-3 beta. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5774 | ENSG00000070159 | PTPN3 | NA |
| colorectal neoplasia differentially expressed (non-protein coding) | NA | ENSG00000245694 | ENSG00000245694 | CRNDE | NA |
| ankyrin 1 | Ankyrins are a family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton and play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Multiple isoforms of ankyrin with different affinities for various target proteins are expressed in a tissue-specific, developmentally regulated manner. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. Ankyrin 1, the prototype of this family, was first discovered in the erythrocytes, but since has also been found in brain and muscles. Mutations in erythrocytic ankyrin 1 have been associated in approximately half of all patients with hereditary spherocytosis. Complex patterns of alternative splicing in the regulatory domain, giving rise to different isoforms of ankyrin 1 have been described. Truncated muscle-specific isoforms of ankyrin 1 resulting from usage of an alternate promoter have also been identified. | 286 | ENSG00000029534 | ANK1 | NA |
| nebulin related anchoring protein | NA | 4892 | ENSG00000197893 | NRAP | NA |
| uncharacterized LOC105370792 | NA | 105370792 | ENSG00000174171 | LOC105370792 | NA |
| NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 4-like 2 | NA | 56901 | ENSG00000185633 | NDUFA4L2 | NA |
| Purkinje cell protein 4 like 1 | NA | 654790 | ENSG00000248485 | PCP4L1 | NA |
| guanylate binding protein 2 | This gene belongs to the guanine-binding protein (GBP) family, which includes interferon-induced proteins that can bind to guanine nucleotides (GMP, GDP and GTP). The encoded protein is a GTPase which hydrolyzes GTP, predominantly to GDP. The protein may play a role as a marker of squamous cell carcinomas. | 2634 | ENSG00000162645 | GBP2 | NA |
| death associated protein kinase 1 | Death-associated protein kinase 1 is a positive mediator of gamma-interferon induced programmed cell death. DAPK1 encodes a structurally unique 160-kD calmodulin dependent serine-threonine kinase that carries 8 ankyrin repeats and 2 putative P-loop consensus sites. It is a tumor suppressor candidate. Alternative splicing results in multiple transcript variants. | 1612 | ENSG00000196730 | DAPK1 | NA |
| transient receptor potential cation channel subfamily M member 4 | The protein encoded by this gene is a calcium-activated nonselective ion channel that mediates transport of monovalent cations across membranes, thereby depolarizing the membrane. The activity of the encoded protein increases with increasing intracellular calcium concentration, but this channel does not transport calcium. | 54795 | ENSG00000130529 | TRPM4 | NA |
| sushi domain containing 2 | NA | 56241 | ENSG00000099994 | SUSD2 | NA |
| NA | NA | NA | ENSG00000269640 | NA | TRUE |
| NA | NA | ENSG00000250654 | ENSG00000250654 | RP11-834C11.7 | NA |
| phosphodiesterase 4D interacting protein | The protein encoded by this gene serves to anchor phosphodiesterase 4D to the Golgi/centrosome region of the cell. Defects in this gene may be a cause of myeloproliferative disorder (MBD) associated with eosinophilia. Several transcript variants encoding different isoforms have been found for this gene. | 9659 | ENSG00000178104 | PDE4DIP | NA |
| NA | NA | ENSG00000250900 | ENSG00000250900 | CTC-338M12.6 | NA |
| RAS like family 11 member B | RASL11B is a member of the small GTPase protein family with a high degree of similarity to RAS (see HRAS, MIM 190020) proteins. | 65997 | ENSG00000128045 | RASL11B | NA |
| interleukin 17 receptor E | This gene encodes a transmembrane protein that functions as the receptor for interleukin-17C. The encoded protein signals to downstream components of the mitogen activated protein kinase (MAPK) pathway. Activity of this protein is important in the immune response to bacterial pathogens. Alternatively spliced transcript variants have been described for this gene. | 132014 | ENSG00000163701 | IL17RE | NA |
| neuropilin 2 | This gene encodes a member of the neuropilin family of receptor proteins. The encoded transmembrane protein binds to SEMA3C protein {sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C} and SEMA3F protein {sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3F}, and interacts with vascular endothelial growth factor (VEGF). This protein may play a role in cardiovascular development, axon guidance, and tumorigenesis. Multiple transcript variants encoding distinct isoforms have been identified for this gene. | 8828 | ENSG00000118257 | NRP2 | NA |
| tripartite motif containing 54 | The protein encoded by this gene contains a RING finger motif and is highly similar to the ring finger proteins RNF28/MURF1 and RNF29/MURF2. In vitro studies demonstrated that this protein, RNF28, and RNF29 form heterodimers, which may be important for the regulation of titin kinase and microtubule-dependent signal pathways in striated muscles. Alternatively spliced transcript variants encoding distinct isoforms have been reported. | 57159 | ENSG00000138100 | TRIM54 | NA |
| cytochrome c oxidase subunit 7A1 | Cytochrome c oxidase (COX), the terminal component of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. This component is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may function in the regulation and assembly of the complex. This nuclear gene encodes polypeptide 1 (muscle isoform) of subunit VIIa and the polypeptide 1 is present only in muscle tissues. Other polypeptides of subunit VIIa are present in both muscle and nonmuscle tissues, and are encoded by different genes. | 1346 | ENSG00000161281 | COX7A1 | NA |
| troponin T1, slow skeletal type | This gene encodes a protein that is a subunit of troponin, which is a regulatory complex located on the thin filament of the sarcomere. This complex regulates striated muscle contraction in response to fluctuations in intracellular calcium concentration. This complex is composed of three subunits: troponin C, which binds calcium, troponin T, which binds tropomyosin, and troponin I, which is an inhibitory subunit. This protein is the slow skeletal troponin T subunit. Mutations in this gene cause nemaline myopathy type 5, also known as Amish nemaline myopathy, a neuromuscular disorder characterized by muscle weakness and rod-shaped, or nemaline, inclusions in skeletal muscle fibers which affects infants, resulting in death due to respiratory insufficiency, usually in the second year. Multiple transcript variants encoding different isoforms have been found for this gene. | 7138 | ENSG00000105048 | TNNT1 | NA |
| regulator of calcineurin 2 | This gene encodes a member of the regulator of calcineurin (RCAN) protein family. These proteins play a role in many physiological processes by binding to the catalytic domain of calcineurin A, inhibiting calcineurin-mediated nuclear translocation of the transcription factor NFATC1. Expression of this gene in skin fibroblasts is upregulated by thyroid hormone, and the encoded protein may also play a role in endothelial cell function and angiogenesis. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 10231 | ENSG00000172348 | RCAN2 | NA |
| tripartite motif containing 7 | The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1, a B-box type 2, and a coiled-coil region. The protein localizes to both the nucleus and the cytoplasm, and may represent a participant in the initiation of glycogen synthesis. Alternative splicing results in multiple transcript variants. | 81786 | ENSG00000146054 | TRIM7 | NA |
| crystallin alpha B | Mammalian lens crystallins are divided into alpha, beta, and gamma families. Alpha crystallins are composed of two gene products: alpha-A and alpha-B, for acidic and basic, respectively. Alpha crystallins can be induced by heat shock and are members of the small heat shock protein (HSP20) family. They act as molecular chaperones although they do not renature proteins and release them in the fashion of a true chaperone; instead they hold them in large soluble aggregates. Post-translational modifications decrease the ability to chaperone. These heterogeneous aggregates consist of 30-40 subunits; the alpha-A and alpha-B subunits have a 3:1 ratio, respectively. Two additional functions of alpha crystallins are an autokinase activity and participation in the intracellular architecture. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Alpha-A and alpha-B gene products are differentially expressed; alpha-A is preferentially restricted to the lens and alpha-B is expressed widely in many tissues and organs. Elevated expression of alpha-B crystallin occurs in many neurological diseases; a missense mutation cosegregated in a family with a desmin-related myopathy. Alternative splicing results in multiple transcript variants. | 1410 | ENSG00000109846 | CRYAB | NA |
| pleckstrin | NA | 5341 | ENSG00000115956 | PLEK | NA |
| Ras association domain family member 2 | This gene encodes a protein that contains a Ras association domain. Similar to its cattle and sheep counterparts, this gene is located near the prion gene. Two alternatively spliced transcripts encoding the same isoform have been reported. | 9770 | ENSG00000101265 | RASSF2 | NA |
| latent transforming growth factor beta binding protein 2 | The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | 4053 | ENSG00000119681 | LTBP2 | NA |
| heat shock protein family A (Hsp70) member 7 | NA | ENSG00000225217 | ENSG00000225217 | HSPA7 | NA |
| fatty acid binding protein 3 | The intracellular fatty acid-binding proteins (FABPs) belong to a multigene family. FABPs are divided into at least three distinct types, namely the hepatic-, intestinal- and cardiac-type. They form 14-15 kDa proteins and are thought to participate in the uptake, intracellular metabolism and/or transport of long-chain fatty acids. They may also be responsible for the modulation of cell growth and proliferation. Fatty acid-binding protein 3 gene contains four exons and its function is to arrest growth of mammary epithelial cells. This gene is a candidate tumor suppressor gene for human breast cancer. Alternative splicing results in multiple transcript variants. | 2170 | ENSG00000121769 | FABP3 | NA |
| phosphatidylethanolamine binding protein 4 | The phosphatidylethanolamine (PE)-binding proteins, including PEBP4, are an evolutionarily conserved family of proteins with pivotal biologic functions, such as lipid binding and inhibition of serine proteases (Wang et al., 2004 [PubMed 15302887]). | 157310 | ENSG00000134020 | PEBP4 | NA |
| tubulin beta 4A class IVa | This gene encodes a member of the beta tubulin family. Beta tubulins are one of two core protein families (alpha and beta tubulins) that heterodimerize and assemble to form microtubules. Mutations in this gene cause hypomyelinating leukodystrophy-6 and autosomal dominant torsion dystonia-4. Alternate splicing results in multiple transcript variants encoding different isoforms. A pseudogene of this gene is found on chromosome X. | 10382 | ENSG00000104833 | TUBB4A | NA |
| G protein-coupled receptor 176 | Members of the G protein-coupled receptor family, such as GPR176, are cell surface receptors involved in responses to hormones, growth factors, and neurotransmitters (Hata et al., 1995 [PubMed 7893747]). | 11245 | ENSG00000166073 | GPR176 | NA |
| dickkopf WNT signaling pathway inhibitor 3 | This gene encodes a protein that is a member of the dickkopf family. The secreted protein contains two cysteine rich regions and is involved in embryonic development through its interactions with the Wnt signaling pathway. The expression of this gene is decreased in a variety of cancer cell lines and it may function as a tumor suppressor gene. Alternative splicing results in multiple transcript variants encoding the same protein. | 27122 | ENSG00000050165 | DKK3 | NA |
| transmembrane protein 182 | NA | 130827 | ENSG00000170417 | TMEM182 | NA |
| tumor necrosis factor receptor superfamily member 12A | NA | 51330 | ENSG00000006327 | TNFRSF12A | NA |
| pleckstrin and Sec7 domain containing | This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | 5662 | ENSG00000059915 | PSD | NA |
| integrin subunit alpha M | This gene encodes the integrin alpha M chain. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain. This I-domain containing alpha integrin combines with the beta 2 chain (ITGB2) to form a leukocyte-specific integrin referred to as macrophage receptor 1 (‘Mac-1’), or inactivated-C3b (iC3b) receptor 3 (‘CR3’). The alpha M beta 2 integrin is important in the adherence of neutrophils and monocytes to stimulated endothelium, and also in the phagocytosis of complement coated particles. Multiple transcript variants encoding different isoforms have been found for this gene. | 3684 | ENSG00000169896 | ITGAM | NA |
| regulator of G-protein signaling 1 | This gene encodes a member of the regulator of G-protein signalling family. This protein is located on the cytosolic side of the plasma membrane and contains a conserved, 120 amino acid motif called the RGS domain. The protein attenuates the signalling activity of G-proteins by binding to activated, GTP-bound G alpha subunits and acting as a GTPase activating protein (GAP), increasing the rate of conversion of the GTP to GDP. This hydrolysis allows the G alpha subunits to bind G beta/gamma subunit heterodimers, forming inactive G-protein heterotrimers, thereby terminating the signal. | 5996 | ENSG00000090104 | RGS1 | NA |
| regulator of G-protein signaling 9 | This gene encodes a member of the RGS family of GTPase activating proteins that function in various signaling pathways by accelerating the deactivation of G proteins. This protein is anchored to photoreceptor membranes in retinal cells and deactivates G proteins in the rod and cone phototransduction cascades. Mutations in this gene result in bradyopsia. Multiple transcript variants encoding different isoforms have been found for this gene. | 8787 | ENSG00000108370 | RGS9 | NA |
| NA | NA | ENSG00000272463 | ENSG00000272463 | RP11-532F6.3 | NA |
| LIM and cysteine rich domains 1 | This gene encodes a member of the LIM-domain family of zinc finger proteins. The encoded protein contains an N-terminal cysteine-rich domain and two C-terminal LIM domains. The presence of LIM domains suggests involvement in protein-protein interactions. The protein may act as a co-regulator of transcription along with other transcription factors. Alternate splicing results in multiple transcript variants of this gene. | 29995 | ENSG00000071282 | LMCD1 | NA |
| growth arrest specific 6 | This gene encodes a gamma-carboxyglutamic acid (Gla)-containing protein thought to be involved in the stimulation of cell proliferation. This gene is frequently overexpressed in many cancers and has been implicated as an adverse prognostic marker. Elevated protein levels are additionally associated with a variety of disease states, including venous thromboembolic disease, systemic lupus erythematosus, chronic renal failure, and preeclampsia. | 2621 | ENSG00000183087 | GAS6 | NA |
| uncharacterized LOC100507002 | NA | 100507002 | ENSG00000263470 | LOC100507002 | NA |
| myosin, heavy chain 6, cardiac muscle, alpha | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | 4624 | ENSG00000197616 | MYH6 | NA |
| CD53 molecule | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein that is known to complex with integrins. It contributes to the transduction of CD2-generated signals in T cells and natural killer cells and has been suggested to play a role in growth regulation. Familial deficiency of this gene has been linked to an immunodeficiency associated with recurrent infectious diseases caused by bacteria, fungi and viruses. Alternative splicing results in multiple transcript variants. | 963 | ENSG00000143119 | CD53 | NA |
| ATPase Na+/K+ transporting subunit beta 2 | The protein encoded by this gene belongs to the family of Na+/K+ and H+/K+ ATPases beta chain proteins, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The beta subunit regulates, through assembly of alpha/beta heterodimers, the number of sodium pumps transported to the plasma membrane. The glycoprotein subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes a beta 2 subunit. Two transcript variants encoding different isoforms have been found for this gene. | 482 | ENSG00000129244 | ATP1B2 | NA |
| creatine kinase, mitochondrial 2 | Mitochondrial creatine kinase (MtCK) is responsible for the transfer of high energy phosphate from mitochondria to the cytosolic carrier, creatine. It belongs to the creatine kinase isoenzyme family. It exists as two isoenzymes, sarcomeric MtCK and ubiquitous MtCK, encoded by separate genes. Mitochondrial creatine kinase occurs in two different oligomeric forms: dimers and octamers, in contrast to the exclusively dimeric cytosolic creatine kinase isoenzymes. Sarcomeric mitochondrial creatine kinase has 80% homology with the coding exons of ubiquitous mitochondrial creatine kinase. This gene contains sequences homologous to several motifs that are shared among some nuclear genes encoding mitochondrial proteins and thus may be essential for the coordinated activation of these genes during mitochondrial biogenesis. Three transcript variants encoding the same protein have been found for this gene. | 1160 | ENSG00000131730 | CKMT2 | NA |
| REST corepressor 2 | NA | 283248 | ENSG00000167771 | RCOR2 | NA |
| NA | NA | ENSG00000225792 | ENSG00000225792 | AC004540.4 | NA |
| titin | This gene encodes a large abundant protein of striated muscle. The product of this gene is divided into two regions, an N-terminal I-band and a C-terminal A-band. The I-band, which is the elastic part of the molecule, contains two regions of tandem immunoglobulin domains on either side of a PEVK region that is rich in proline, glutamate, valine and lysine. The A-band, which is thought to act as a protein-ruler, contains a mixture of immunoglobulin and fibronectin repeats, and possesses kinase activity. An N-terminal Z-disc region and a C-terminal M-line region bind to the Z-line and M-line of the sarcomere, respectively, so that a single titin molecule spans half the length of a sarcomere. Titin also contains binding sites for muscle associated proteins so it serves as an adhesion template for the assembly of contractile machinery in muscle cells. It has also been identified as a structural protein for chromosomes. Alternative splicing of this gene results in multiple transcript variants. Considerable variability exists in the I-band, the M-line and the Z-disc regions of titin. Variability in the I-band region contributes to the differences in elasticity of different titin isoforms and, therefore, to the differences in elasticity of different muscle types. Mutations in this gene are associated with familial hypertrophic cardiomyopathy 9, and autoantibodies to titin are produced in patients with the autoimmune disease scleroderma. | 7273 | ENSG00000155657 | TTN | NA |
| NA | NA | NA | ENSG00000272003 | NA | TRUE |
| NA | NA | ENSG00000254539 | ENSG00000254539 | RP4-791M13.3 | NA |
| calcium/calmodulin dependent protein kinase II beta | The product of this gene belongs to the serine/threonine protein kinase family and to the Ca(2+)/calmodulin-dependent protein kinase subfamily. Calcium signaling is crucial for several aspects of plasticity at glutamatergic synapses. In mammalian cells, the enzyme is composed of four different chains: alpha, beta, gamma, and delta. The product of this gene is a beta chain. It is possible that distinct isoforms of this chain have different cellular localizations and interact differently with calmodulin. Alternative splicing results in multiple transcript variants. | 816 | ENSG00000058404 | CAMK2B | NA |
| carbohydrate (N-acetylgalactosamine 4-sulfate 6-O) sulfotransferase 15 | Chondroitin sulfate (CS) is a glycosaminoglycan which is an important structural component of the extracellular matrix and which links to proteins to form proteoglycans. Chondroitin sulfate E (CS-E) is an isomer of chondroitin sulfate in which the C-4 and C-6 hydroxyl groups are sulfated. This gene encodes a type II transmembrane glycoprotein that acts as a sulfotransferase to transfer sulfate to the C-6 hydroxyl group of chondroitin sulfate. This gene has also been identified as being co-expressed with RAG1 in B-cells and as potentially acting as a B-cell surface signaling receptor. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 51363 | ENSG00000182022 | CHST15 | NA |
| PTPRF interacting protein alpha 4 | PPFIA4, or liprin-alpha-4, belongs to the liprin-alpha gene family. See liprin-alpha-1 (LIP1, or PPFIA1; MIM 611054) for background on liprins. | 8497 | ENSG00000143847 | PPFIA4 | NA |
| lymphocyte cytosolic protein 1 | Plastins are a family of actin-binding proteins that are conserved throughout eukaryote evolution and expressed in most tissues of higher eukaryotes. In humans, two ubiquitous plastin isoforms (L and T) have been identified. Plastin 1 (otherwise known as Fimbrin) is a third distinct plastin isoform which is specifically expressed at high levels in the small intestine. The L isoform is expressed only in hemopoietic cell lineages, while the T isoform has been found in all other normal cells of solid tissues that have replicative potential (fibroblasts, endothelial cells, epithelial cells, melanocytes, etc.). However, L-plastin has been found in many types of malignant human cells of non-hemopoietic origin suggesting that its expression is induced accompanying tumorigenesis in solid tissues. | 3936 | ENSG00000136167 | LCP1 | NA |
| ephrin A5 | Ephrin-A5, a member of the ephrin gene family, prevents axon bundling in cocultures of cortical neurons with astrocytes, a model of late stage nervous system development and differentiation. The EPH and EPH-related receptors comprise the largest subfamily of receptor protein-tyrosine kinases and have been implicated in mediating developmental events, particularly in the nervous system. EPH receptors typically have a single kinase domain and an extracellular region containing a Cys-rich domain and 2 fibronectin type III repeats. The ephrin ligands and receptors have been named by the Eph Nomenclature Committee (1997). Based on their structures and sequence relationships, ephrins are divided into the ephrin-A (EFNA) class, which are anchored to the membrane by a glycosylphosphatidylinositol linkage, and the ephrin-B (EFNB) class, which are transmembrane proteins. The Eph family of receptors are similarly divided into 2 groups based on the similarity of their extracellular domain sequences and their affinities for binding ephrin-A and ephrin-B ligands. | 1946 | ENSG00000184349 | EFNA5 | NA |
| glypican 1 | Cell surface heparan sulfate proteoglycans are composed of a membrane-associated protein core substituted with a variable number of heparan sulfate chains. Members of the glypican-related integral membrane proteoglycan family (GRIPS) contain a core protein anchored to the cytoplasmic membrane via a glycosyl phosphatidylinositol linkage. These proteins may play a role in the control of cell division and growth regulation. | 2817 | ENSG00000063660 | GPC1 | NA |
| cortexin 1 | NA | 404217 | ENSG00000178531 | CTXN1 | NA |
| FXYD domain containing ion transport regulator 6 | This gene encodes a member of the FXYD family of transmembrane proteins. This particular protein encodes phosphohippolin, which likely affects the activity of Na,K-ATPase. Multiple alternatively spliced transcript variants encoding the same protein have been described. Related pseudogenes have been identified on chromosomes 10 and X. Read-through transcripts have been observed between this locus and the downstream sodium/potassium-transporting ATPase subunit gamma (FXYD2, GeneID 486) locus. | 53826 | ENSG00000137726 | FXYD6 | NA |
| lysyl oxidase | This gene encodes a member of the lysyl oxidase family of proteins. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate a regulatory propeptide and the mature enzyme. The copper-dependent amine oxidase activity of this enzyme functions in the crosslinking of collagens and elastin, while the propeptide may play a role in tumor suppression. | 4015 | ENSG00000113083 | LOX | NA |
| inositol polyphosphate-5-phosphatase J | NA | 27124 | ENSG00000185133 | INPP5J | NA |
| prolyl 3-hydroxylase 3 | The protein encoded by this gene belongs to the leprecan family of proteoglycans, which function as collagen prolyl hydroxylases that are required for proper collagen biosynthesis, folding and assembly. This protein, like other family members, is thought to reside in the endoplasmic reticulum. Epigenetic inactivation of this gene is associated with breast and other cancers, suggesting that it may function as a tumor suppressor. | 10536 | ENSG00000110811 | P3H3 | NA |
| cystatin E/M | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and the kininogens. The type 2 cystatin proteins are a class of cysteine proteinase inhibitors found in a variety of human fluids and secretions, where they appear to provide protective functions. This gene encodes a cystatin from the type 2 family, which is down-regulated in metastatic breast tumor cells as compared to primary tumor cells. Loss of expression is likely associated with the progression of a primary tumor to a metastatic phenotype. | 1474 | ENSG00000175315 | CST6 | NA |
| phospholipase C gamma 2 | The protein encoded by this gene is a transmembrane signaling enzyme that catalyzes the conversion of 1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate to 1D-myo-inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG) using calcium as a cofactor. IP3 and DAG are second messenger molecules important for transmitting signals from growth factor receptors and immune system receptors across the cell membrane. Mutations in this gene have been found in autoinflammation, antibody deficiency, and immune dysregulation syndrome and familial cold autoinflammatory syndrome 3. | 5336 | ENSG00000197943 | PLCG2 | NA |
| cAMP responsive element binding protein 3 like 1 | The protein encoded by this gene is normally found in the membrane of the endoplasmic reticulum (ER). However, upon stress to the ER, the encoded protein is cleaved and the released cytoplasmic transcription factor domain translocates to the nucleus. There it activates the transcription of target genes by binding to box-B elements. | 90993 | ENSG00000157613 | CREB3L1 | NA |
| RNA, 5S ribosomal pseudogene 352 | NA | ENSG00000200278 | ENSG00000200278 | RNA5SP352 | NA |
| cysteine rich protein 2 | This gene encodes a putative transcription factor with two LIM zinc-binding domains. The encoded protein may participate in the differentiation of smooth muscle tissue. Alternative splicing results in multiple transcript variants. | 1397 | ENSG00000182809 | CRIP2 | NA |
| NA | NA | NA | ENSG00000203691 | NA | TRUE |
| kelch like family member 5 | NA | 51088 | ENSG00000109790 | KLHL5 | NA |
| myosin light chain 3 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. | 4634 | ENSG00000160808 | MYL3 | NA |
| filamin binding LIM protein 1 | This gene encodes a protein with an N-terminal filamin-binding domain, a central proline-rich domain, and multiple C-terminal LIM domains. This protein localizes at cell junctions and may link cell adhesion structures to the actin cytoskeleton. This protein may be involved in the assembly and stabilization of actin-filaments and likely plays a role in modulating cell adhesion, cell morphology and cell motility. This protein also localizes to the nucleus and may affect cardiomyocyte differentiation after binding with the CSX/NKX2-5 transcription factor. Alternative splicing results in multiple transcript variants encoding different isoforms. | 54751 | ENSG00000162458 | FBLIM1 | NA |
| collagen type VII alpha 1 | This gene encodes the alpha chain of type VII collagen. The type VII collagen fibril, composed of three identical alpha collagen chains, is restricted to the basement zone beneath stratified squamous epithelia. It functions as an anchoring fibril between the external epithelia and the underlying stroma. Mutations in this gene are associated with all forms of dystrophic epidermolysis bullosa. In the absence of mutations, however, an acquired form of this disease can result from an autoimmune response made to type VII collagen. | 1294 | ENSG00000114270 | COL7A1 | NA |
| neutrophil cytosolic factor 4 | The protein encoded by this gene is a cytosolic regulatory component of the superoxide-producing phagocyte NADPH-oxidase, a multicomponent enzyme system important for host defense. This protein is preferentially expressed in cells of myeloid lineage. It interacts primarily with neutrophil cytosolic factor 2 (NCF2/p67-phox) to form a complex with neutrophil cytosolic factor 1 (NCF1/p47-phox), which further interacts with the small G protein RAC1 and translocates to the membrane upon cell stimulation. This complex then activates flavocytochrome b, the membrane-integrated catalytic core of the enzyme system. The PX domain of this protein can bind phospholipid products of the PI(3) kinase, which suggests its role in PI(3) kinase-mediated signaling events. The phosphorylation of this protein was found to negatively regulate the enzyme activity. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | 4689 | ENSG00000100365 | NCF4 | NA |
write.table(as.factor(out$query),
            file = paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 1, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
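The query-then-save step above is repeated once per factor, so it could be wrapped in a small helper. The following is a minimal sketch, not the code used for this analysis: `annotate_factors()`, `query_fun`, `stub_query`, and the demo objects are hypothetical names introduced here, and the default `query_fun` simply reuses the `mygene::queryMany` call from this report (which needs the mygene package and network access). The demonstration swaps in an offline stub so that it runs on its own.

```r
# Sketch only: annotate_factors() and query_fun are hypothetical names,
# not part of the original analysis.
annotate_factors <- function(gene_list, out_dir,
                             query_fun = function(ids) {
                               # Same query as in this report; requires the
                               # mygene package and network access.
                               mygene::queryMany(ids, scopes = "ensembl.gene",
                                                 fields = c("name", "summary", "symbol"),
                                                 species = "human")
                             }) {
  lapply(seq_len(nrow(gene_list)), function(k) {
    out <- query_fun(gene_list[k, ])
    # Write one Ensembl ID per line to gene_names_clus_<k>.txt.
    write.table(out$query,
                file = file.path(out_dir, paste0("gene_names_clus_", k, ".txt")),
                col.names = FALSE, row.names = FALSE, quote = FALSE)
    out
  })
}

# Offline demonstration: a stub stands in for the mygene query.
stub_query <- function(ids) {
  data.frame(query = ids, symbol = NA_character_, stringsAsFactors = FALSE)
}
demo_genes <- rbind(c("ENSG00000171401", "ENSG00000170477"),
                    c("ENSG00000196549", "ENSG00000091129"))
demo_dir <- tempfile("clus_")
dir.create(demo_dir)
res <- annotate_factors(demo_genes, demo_dir, query_fun = stub_query)
```

With the real query, `annotate_factors(gene_list, "../utilities/GTEX2013_sparse_fac_voom")` would return one annotation table per factor, each of which can be passed to `kable(as.data.frame(...))` as in the rest of this report.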
out <- mygene::queryMany(gene_list[2,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | summary | name | symbol | query | notfound |
|---|---|---|---|---|---|
| 4311 | This gene encodes a common acute lymphocytic leukemia antigen that is an important cell surface marker in the diagnosis of human acute lymphocytic leukemia (ALL). This protein is present on leukemic cells of pre-B phenotype, which represent 85% of cases of ALL. This protein is not restricted to leukemic cells, however, and is found on a variety of normal tissues. It is a glycoprotein that is particularly abundant in kidney, where it is present on the brush border of proximal tubules and on glomerular epithelium. The protein is a neutral endopeptidase that cleaves peptides at the amino side of hydrophobic residues and inactivates several peptide hormones including glucagon, enkephalins, substance P, neurotensin, oxytocin, and bradykinin. This gene, which encodes a 100-kD type II transmembrane glycoprotein, exists in a single copy of greater than 45 kb. The 5’ untranslated region of this gene is alternatively spliced, resulting in four separate mRNA transcripts. The coding region is not affected by alternative splicing. | membrane metallo-endopeptidase | MME | ENSG00000196549 | NA |
| 4897 | Cell adhesion molecules (CAMs) are members of the immunoglobulin superfamily. This gene encodes a neuronal cell adhesion molecule with multiple immunoglobulin-like C2-type domains and fibronectin type-III domains. This ankyrin-binding protein is involved in neuron-neuron adhesion and promotes directional signaling during axonal cone growth. This gene is also expressed in non-neural tissues and may play a general role in cell-cell communication via signaling from its intracellular domain to the actin cytoskeleton during directional cell migration. Allelic variants of this gene have been associated with autism and addiction vulnerability. Alternative splicing results in multiple transcript variants encoding different isoforms. | neuronal cell adhesion molecule | NRCAM | ENSG00000091129 | NA |
| 2938 | This gene encodes a member of a family of enzymes that function to add glutathione to target electrophilic compounds, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. This action is an important step in detoxification of these compounds. This subfamily of enzymes has a particular role in protecting cells from reactive oxygen species and the products of peroxidation. Polymorphisms in this gene influence the ability of individuals to metabolize different drugs. This gene is located in a cluster of similar genes and pseudogenes on chromosome 6. Alternative splicing results in multiple transcript variants. | glutathione S-transferase alpha 1 | GSTA1 | ENSG00000243955 | NA |
| 2243 | This gene encodes the alpha subunit of the coagulation factor fibrinogen, which is a component of the blood clot. Following vascular injury, the encoded preproprotein is proteolytically processed by thrombin during the conversion of fibrinogen to fibrin. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia and renal amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. | fibrinogen alpha chain | FGA | ENSG00000171560 | NA |
| 259 | This gene encodes a complex glycoprotein secreted in plasma. The precursor is proteolytically processed into distinct functioning proteins: alpha-1-microglobulin, which belongs to the superfamily of lipocalin transport proteins and may play a role in the regulation of inflammatory processes, and bikunin, which is a urinary trypsin inhibitor belonging to the superfamily of Kunitz-type protease inhibitors and plays an important role in many physiological and pathological processes. This gene is located on chromosome 9 in a cluster of lipocalin genes. | alpha-1-microglobulin/bikunin precursor | AMBP | ENSG00000106927 | NA |
| 2244 | The protein encoded by this gene is the beta component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including afibrinogenemia, dysfibrinogenemia, hypodysfibrinogenemia and thrombotic tendency. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | fibrinogen beta chain | FGB | ENSG00000171564 | NA |
| ENSG00000254902 | NA | ANO1 antisense RNA 1 | ANO1-AS1 | ENSG00000254902 | NA |
| 10912 | This gene is a member of a group of genes whose transcript levels are increased following stressful growth arrest conditions and treatment with DNA-damaging agents. The protein encoded by this gene responds to environmental stresses by mediating activation of the p38/JNK pathway via MTK1/MEKK4 kinase. The GADD45G is highly expressed in placenta. | growth arrest and DNA damage inducible gamma | GADD45G | ENSG00000130222 | NA |
| 8470 | Arg and c-Abl represent the mammalian members of the Abelson family of non-receptor protein-tyrosine kinases. They interact with the Arg/Abl binding proteins via the SH3 domains present in the carboxy end of the latter group of proteins. This gene encodes the sorbin and SH3 domain containing 2 protein. It has three C-terminal SH3 domains and an N-terminal sorbin homology (SoHo) domain that interacts with lipid raft proteins. The subcellular localization of this protein in epithelial and cardiac muscle cells suggests that it functions as an adapter protein to assemble signaling complexes in stress fibers, and that it is a potential link between Abl family kinases and the actin cytoskeleton. Alternative splicing results in multiple transcript variants encoding different isoforms. | sorbin and SH3 domain containing 2 | SORBS2 | ENSG00000154556 | NA |
| 213 | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | albumin | ALB | ENSG00000163631 | NA |
| 78989 | This gene encodes a member of the collectin family of C-type lectins that possess collagen-like sequences and carbohydrate recognition domains. Collectins are secreted proteins that play important roles in the innate immune system by binding to carbohydrate antigens on microorganisms, facilitating their recognition and removal. The encoded protein binds to multiple sugars with a preference for fucose and mannose. Mutations in this gene are a cause of 3MC syndrome-2. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | collectin subfamily member 11 | COLEC11 | ENSG00000118004 | NA |
| ENSG00000225670 | NA | CADM3 antisense RNA 1 | CADM3-AS1 | ENSG00000225670 | NA |
| 7448 | The protein encoded by this gene is a member of the pexin family. It is found in serum and tissues and promotes cell adhesion and spreading, inhibits the membrane-damaging effect of the terminal cytolytic complement pathway, and binds to several serpin serine protease inhibitors. It is a secreted protein and exists in either a single chain form or a clipped, two chain form held together by a disulfide bond. | vitronectin | VTN | ENSG00000109072 | NA |
| 350 | Apolipoprotein H has been implicated in a variety of physiologic pathways including lipoprotein metabolism, coagulation, and the production of antiphospholipid autoantibodies. APOH may be a required cofactor for anionic phospholipid binding by the antiphospholipid autoantibodies found in sera of many patients with lupus and primary antiphospholipid syndrome, but it does not seem to be required for the reactivity of antiphospholipid autoantibodies associated with infections. | apolipoprotein H | APOH | ENSG00000091583 | NA |
| 23498 | 3-Hydroxyanthranilate 3,4-dioxygenase is a monomeric cytosolic protein belonging to the family of intramolecular dioxygenases containing nonheme ferrous iron. It is widely distributed in peripheral organs, such as liver and kidney, and is also present in low amounts in the central nervous system. HAAO catalyzes the synthesis of quinolinic acid (QUIN) from 3-hydroxyanthranilic acid. QUIN is an excitotoxin whose toxicity is mediated by its ability to activate glutamate N-methyl-D-aspartate receptors. Increased cerebral levels of QUIN may participate in the pathogenesis of neurologic and inflammatory disorders. HAAO has been suggested to play a role in disorders associated with altered tissue levels of QUIN. | 3-hydroxyanthranilate 3,4-dioxygenase | HAAO | ENSG00000162882 | NA |
| 100873993 | NA | ITIH4 antisense RNA 1 | ITIH4-AS1 | ENSG00000239799 | NA |
| 57863 | IGSF4B is a brain-specific protein related to the calcium-independent cell-cell adhesion molecules known as nectins (see PVRL3; MIM 607147) (Kakunaga et al., 2005 [PubMed 15741237]). | cell adhesion molecule 3 | CADM3 | ENSG00000162706 | NA |
| 1571 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and is induced by ethanol, the diabetic state, and starvation. The enzyme metabolizes both endogenous substrates, such as ethanol, acetone, and acetal, as well as exogenous substrates including benzene, carbon tetrachloride, ethylene glycol, and nitrosamines which are premutagens found in cigarette smoke. Due to its many substrates, this enzyme may be involved in such varied processes as gluconeogenesis, hepatic cirrhosis, diabetes, and cancer. | cytochrome P450 family 2 subfamily E member 1 | CYP2E1 | ENSG00000130649 | NA |
| 9435 | This locus encodes a sulfotransferase protein. The encoded enzyme catalyzes the sulfation of a nonreducing N-acetylglucosamine residue, and may play a role in biosynthesis of 6-sulfosialyl Lewis X antigen. | carbohydrate sulfotransferase 2 | CHST2 | ENSG00000175040 | NA |
| 5507 | This gene encodes a regulatory subunit of protein phosphatase-1 (PP1). PP1 catalyzes reversible protein phosphorylation, which is important in a wide range of cellular activities: neuronal, muscular, RNA splicing, protein synthesis, cell death, and glycogen metabolism, to name just a few. By interacting with different regulatory subunits, PP1 is directed to different parts of the cell, to different substrates, or to respond to extracellular signals. | protein phosphatase 1 regulatory subunit 3C | PPP1R3C | ENSG00000119938 | NA |
| 2266 | The protein encoded by this gene is the gamma component of fibrinogen, a blood-borne glycoprotein comprised of three pairs of nonidentical polypeptide chains. Following vascular injury, fibrinogen is cleaved by thrombin to form fibrin which is the most abundant component of blood clots. In addition, various cleavage products of fibrinogen and fibrin regulate cell adhesion and spreading, display vasoconstrictor and chemotactic activities, and are mitogens for several cell types. Mutations in this gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia and thrombophilia. Alternative splicing results in transcript variants encoding different isoforms. | fibrinogen gamma chain | FGG | ENSG00000171557 | NA |
| 335 | This gene encodes apolipoprotein A-I, which is the major protein component of high density lipoprotein (HDL) in plasma. The encoded preproprotein is proteolytically processed to generate the mature protein, which promotes cholesterol efflux from tissues to the liver for excretion, and is a cofactor for lecithin cholesterolacyltransferase (LCAT), an enzyme responsible for the formation of most plasma cholesteryl esters. This gene is closely linked with two other apolipoprotein genes on chromosome 11. Defects in this gene are associated with HDL deficiencies, including Tangier disease, and with systemic non-neuropathic amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein. | apolipoprotein A1 | APOA1 | ENSG00000118137 | NA |
| 81035 | This gene encodes a member of the C-lectin family, proteins that possess collagen-like sequences and carbohydrate recognition domains. This protein is a scavenger receptor, a cell surface glycoprotein that displays several functions associated with host defense. It can bind to carbohydrate antigens on microorganisms, facilitating their recognition and removal. It also mediates the recognition, internalization, and degradation of oxidatively modified low density lipoprotein by vascular endothelial cells. | collectin subfamily member 12 | COLEC12 | ENSG00000158270 | NA |
| ENSG00000271857 | NA | NA | RP1-244F24.1 | ENSG00000271857 | NA |
| 148534 | NA | transmembrane protein 56 | TMEM56 | ENSG00000152078 | NA |
| 7439 | This gene encodes a member of the bestrophin gene family. This small gene family is characterized by proteins with a highly conserved N-terminus with four to six transmembrane domains. Bestrophins may form chloride ion channels or may regulate voltage-gated L-type calcium-ion channels. Bestrophins are generally believed to form calcium-activated chloride-ion channels in epithelial cells but they have also been shown to be highly permeable to bicarbonate ion transport in retinal tissue. Mutations in this gene are responsible for juvenile-onset vitelliform macular dystrophy (VMD2), also known as Best macular dystrophy, in addition to adult-onset vitelliform macular dystrophy (AVMD) and other retinopathies. Alternative splicing results in multiple variants encoding distinct isoforms. | bestrophin 1 | BEST1 | ENSG00000167995 | NA |
| ENSG00000232815 | NA | double homeobox 4 like 50, pseudogene | DUX4L50 | ENSG00000232815 | NA |
| 10809 | NA | StAR related lipid transfer domain containing 10 | STARD10 | ENSG00000214530 | NA |
| 10098 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | tetraspanin 5 | TSPAN5 | ENSG00000168785 | NA |
| 345 | Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-1 and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | apolipoprotein C3 | APOC3 | ENSG00000110245 | NA |
| 229 | Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | aldolase, fructose-bisphosphate B | ALDOB | ENSG00000136872 | NA |
| 3242 | The protein encoded by this gene is an enzyme in the catabolic pathway of tyrosine. The encoded protein catalyzes the conversion of 4-hydroxyphenylpyruvate to homogentisate. Defects in this gene are a cause of tyrosinemia type 3 (TYRO3) and hawkinsinuria (HAWK). Two transcript variants encoding different isoforms have been found for this gene. | 4-hydroxyphenylpyruvate dioxygenase | HPD | ENSG00000158104 | NA |
| 104326055 | NA | APOA1 antisense RNA | APOA1-AS | ENSG00000235910 | NA |
| 57168 | NA | aspartate beta-hydroxylase domain containing 2 | ASPHD2 | ENSG00000128203 | NA |
| 3699 | This gene encodes the heavy chain subunit of the pre-alpha-trypsin inhibitor complex. This complex may stabilize the extracellular matrix through its ability to bind hyaluronic acid. Polymorphisms of this gene may be associated with increased risk for schizophrenia and major depressive disorder. This gene is present in an inter-alpha-trypsin inhibitor family gene cluster on chromosome 3. | inter-alpha-trypsin inhibitor heavy chain 3 | ITIH3 | ENSG00000162267 | NA |
| 80714 | This gene encodes a member of the pre-B cell leukemia transcription factor family. These proteins are homeobox proteins that play critical roles in embryonic development and cellular differentiation both as Hox cofactors and through Hox-independent pathways. The encoded protein contains a homeobox DNA-binding domain, but specific functions of the protein have not been determined. Alternatively spliced transcript variants have been observed for this gene. | PBX homeobox 4 | PBX4 | ENSG00000105717 | NA |
| 338773 | NA | transmembrane protein 119 | TMEM119 | ENSG00000183160 | NA |
| 5376 | This gene encodes an integral membrane protein that is a major component of myelin in the peripheral nervous system. Studies suggest two alternately used promoters drive tissue-specific expression. Various mutations of this gene are causes of Charcot-Marie-Tooth disease Type IA, Dejerine-Sottas syndrome, and hereditary neuropathy with liability to pressure palsies. Alternative splicing results in multiple transcript variants. | peripheral myelin protein 22 | PMP22 | ENSG00000109099 | NA |
| 183 | The protein encoded by this gene, pre-angiotensinogen or angiotensinogen precursor, is expressed in the liver and is cleaved by the enzyme renin in response to lowered blood pressure. The resulting product, angiotensin I, is then cleaved by angiotensin converting enzyme (ACE) to generate the physiologically active enzyme angiotensin II. The protein is involved in maintaining blood pressure and in the pathogenesis of essential hypertension and preeclampsia. Mutations in this gene are associated with susceptibility to essential hypertension, and can cause renal tubular dysgenesis, a severe disorder of renal tubular development. Defects in this gene have also been associated with non-familial structural atrial fibrillation, and inflammatory bowel disease. | angiotensinogen | AGT | ENSG00000135744 | NA |
| 1558 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and its expression is induced by phenobarbital. The enzyme is known to metabolize many xenobiotics, including the anticonvulsive drug mephenytoin, benzo(a)pyrene, 7-ethyoxycoumarin, and the anti-cancer drug taxol. This gene is located within a cluster of cytochrome P450 genes on chromosome 10q24. Several transcript variants encoding a few different isoforms have been found for this gene. | cytochrome P450 family 2 subfamily C member 8 | CYP2C8 | ENSG00000138115 | NA |
| ENSG00000266844 | NA | NA | RP11-862L9.3 | ENSG00000266844 | NA |
| 23406 | This gene encodes one of the numerous actin-binding proteins which regulate the actin cytoskeleton. This protein binds F-actin, and also interacts with 5-lipoxygenase, which is the first committed enzyme in leukotriene biosynthesis. Although this gene has been reported to map to chromosome 17 in the Smith-Magenis syndrome region, the best alignments for this gene are to chromosome 16. The Smith-Magenis syndrome region is the site of two related pseudogenes. | coactosin like F-actin binding protein 1 | COTL1 | ENSG00000103187 | NA |
| 7070 | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | Thy-1 cell surface antigen | THY1 | ENSG00000154096 | NA |
| 4974 | NA | oligodendrocyte myelin glycoprotein | OMG | ENSG00000126861 | NA |
| 388849 | NA | coiled-coil domain containing 188 | CCDC188 | ENSG00000234409 | NA |
| 1368 | The protein encoded by this gene is a membrane-bound arginine/lysine carboxypeptidase. Its expression is associated with monocyte to macrophage differentiation. This encoded protein contains hydrophobic regions at the amino and carboxy termini and has 6 potential asparagine-linked glycosylation sites. The active site residues of carboxypeptidases A and B are conserved in this protein. Three alternatively spliced transcript variants encoding the same protein have been described for this gene. | carboxypeptidase M | CPM | ENSG00000135678 | NA |
| 55890 | The protein encoded by this gene is a member of the type 3 G protein-coupled receptor family. Members of this superfamily are characterized by a signature 7-transmembrane domain motif. The specific function of this protein is unknown; however, this protein may mediate the cellular effects of retinoic acid on the G protein signal transduction cascade. Two transcript variants encoding different isoforms have been found for this gene. | G protein-coupled receptor class C group 5 member C | GPRC5C | ENSG00000170412 | NA |
| 22824 | The protein encoded by this gene is heat shock inducible and may act as a chaperone. The encoded protein can protect the heat-shocked cell against the harmful effects of aggregated proteins. This gene is highly expressed in leukemia cells and may be a good target for therapeutic intervention. Several transcripts encoding different isoforms have been found for this gene. | heat shock protein family A (Hsp70) member 4 like | HSPA4L | ENSG00000164070 | NA |
| 23554 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | tetraspanin 12 | TSPAN12 | ENSG00000106025 | NA |
| 84952 | This gene encodes a member of the cingulin family. The encoded protein localizes to both adherens and tight cell-cell junctions and mediates junction assembly and maintenance by regulating the activity of the small GTPases RhoA and Rac1. Heterozygous chromosomal rearrangements resulting in association of the promoter for this gene with the aromatase gene are a cause of aromatase excess syndrome. Alternatively spliced transcript variants have been observed for this gene. | cingulin-like 1 | CGNL1 | ENSG00000128849 | NA |
| ENSG00000263873 | NA | NA | RP11-334E6.12 | ENSG00000263873 | NA |
| 112817 | The authors of PMID:20797690 cloned this gene while searching for genes in a region of chromosome 10 linked to primary hyperoxalurea type III. They noted that even though the encoded protein has been described as a mitochondrial dihydrodipicolinate synthase-like enzyme, it shares little homology with E. coli dihydrodipicolinate synthase (Dhdps), particularly in the putative substrate-binding region. Moreover, neither lysine biosynthesis nor sialic acid metabolism, for which Dhdps is responsible, occurs in vertebrate mitochondria. They propose that this gene encodes mitochondrial 4-hydroxyl-2-oxoglutarate aldolase (EC 4.1.3.16), which catalyzes the final step in the metabolic pathway of hydroxyproline, releasing glyoxylate and pyruvate. This gene is predominantly expressed in the liver and kidney, and mutations in this gene are found in patients with primary hyperoxalurea type III. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | 4-hydroxy-2-oxoglutarate aldolase 1 | HOGA1 | ENSG00000241935 | NA |
| 29984 | Ras homolog, or Rho, proteins interact with protein kinases and may serve as targets for activated GTPase. They play a critical role in muscle differentiation. The protein encoded by this gene binds GTP and is a member of the small GTPase superfamily. It is involved in endosome dynamics and reorganization of the actin cytoskeleton, and it may coordinate membrane transport with the function of the cytoskeleton. Two transcript variants encoding different isoforms have been found for this gene. | ras homolog family member D | RHOD | ENSG00000173156 | NA |
| NA | NA | NA | NA | ENSG00000255824 | TRUE |
| 336 | This gene encodes apolipoprotein (apo-) A-II, which is the second most abundant protein of the high density lipoprotein particles. The protein is found in plasma as a monomer, homodimer, or heterodimer with apolipoprotein D. Defects in this gene may result in apolipoprotein A-II deficiency or hypercholesterolemia. | apolipoprotein A2 | APOA2 | ENSG00000158874 | NA |
| 115908 | This locus encodes a protein that may play a role in the cellular response to arterial injury through involvement in vascular remodeling. Mutations at this locus have been associated with Barrett esophagus and esophageal adenocarcinoma. Alternatively spliced transcript variants have been described. | collagen triple helix repeat containing 1 | CTHRC1 | ENSG00000164932 | NA |
| 9536 | The protein encoded by this gene is a glutathione-dependent prostaglandin E synthase. The expression of this gene has been shown to be induced by proinflammatory cytokine interleukin 1 beta (IL1B). Its expression can also be induced by tumor suppressor protein TP53, and may be involved in TP53 induced apoptosis. Knockout studies in mice suggest that this gene may contribute to the pathogenesis of collagen-induced arthritis and mediate acute pain during inflammatory responses. | prostaglandin E synthase | PTGES | ENSG00000148344 | NA |
| 3557 | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | interleukin 1 receptor antagonist | IL1RN | ENSG00000136689 | NA |
| 100507392 | NA | smooth muscle and endothelial cell enriched migration/differentiation-associated long non-coding RNA | SENCR | ENSG00000254703 | NA |
| 26251 | Voltage-gated potassium (Kv) channels represent the most complex class of voltage-gated ion channels from both functional and structural standpoints. Their diverse functions include regulating neurotransmitter release, heart rate, insulin secretion, neuronal excitability, epithelial electrolyte transport, smooth muscle contraction, and cell volume. This gene encodes a member of the potassium channel, voltage-gated, subfamily G. This member is a gamma subunit of the voltage-gated potassium channel. The delayed-rectifier type channels containing this subunit may contribute to cardiac action potential repolarization. | potassium voltage-gated channel modifier subfamily G member 2 | KCNG2 | ENSG00000178342 | NA |
| 1401 | The protein encoded by this gene belongs to the pentaxin family. It is involved in several host defense related functions based on its ability to recognize foreign pathogens and damaged cells of the host and to initiate their elimination by interacting with humoral and cellular effector systems in the blood. Consequently, the level of this protein in plasma increases greatly during acute phase response to tissue injury, infection, or other inflammatory stimuli. | C-reactive protein, pentraxin-related | CRP | ENSG00000132693 | NA |
| 253982 | NA | aspartate beta-hydroxylase domain containing 1 | ASPHD1 | ENSG00000174939 | NA |
| 2053 | This gene encodes a member of the epoxide hydrolase family. The protein, found in both the cytosol and peroxisomes, binds to specific epoxides and converts them to the corresponding dihydrodiols. Mutations in this gene have been associated with familial hypercholesterolemia. Alternatively spliced transcript variants have been described. | epoxide hydrolase 2 | EPHX2 | ENSG00000120915 | NA |
| ENSG00000271833 | NA | NA | RP11-356B19.11 | ENSG00000271833 | NA |
| 10788 | This gene encodes a member of the IQGAP family. The protein contains three IQ domains, one calponin homology domain, one Ras-GAP domain and one WW domain. It interacts with components of the cytoskeleton, with cell adhesion molecules, and with several signaling molecules to regulate cell morphology and motility. | IQ motif containing GTPase activating protein 2 | IQGAP2 | ENSG00000145703 | NA |
| 84842 | NA | 4-hydroxyphenylpyruvate dioxygenase like | HPDL | ENSG00000186603 | NA |
| 1636 | This gene encodes an enzyme involved in catalyzing the conversion of angiotensin I into a physiologically active peptide angiotensin II. Angiotensin II is a potent vasopressor and aldosterone-stimulating peptide that controls blood pressure and fluid-electrolyte balance. This enzyme plays a key role in the renin-angiotensin system. Many studies have associated the presence or absence of a 287 bp Alu repeat element in this gene with the levels of circulating enzyme or cardiovascular pathophysiologies. Multiple alternatively spliced transcript variants encoding different isoforms have been identified, and two most abundant spliced variants encode the somatic form and the testicular form, respectively, that are equally active. | angiotensin I converting enzyme | ACE | ENSG00000159640 | NA |
| 347 | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | apolipoprotein D | APOD | ENSG00000189058 | NA |
| 27165 | The protein encoded by this gene is a mitochondrial phosphate-activated glutaminase that catalyzes the hydrolysis of glutamine to stoichiometric amounts of glutamate and ammonia. Originally thought to be liver-specific, this protein has been found in other tissues as well. Alternative splicing results in multiple transcript variants that encode different isoforms. | glutaminase 2 | GLS2 | ENSG00000135423 | NA |
| 3856 | This gene is a member of the type II keratin family clustered on the long arm of chromosome 12. Type I and type II keratins heteropolymerize to form intermediate-sized filaments in the cytoplasm of epithelial cells. The product of this gene typically dimerizes with keratin 18 to form an intermediate filament in simple single-layered epithelial cells. This protein plays a role in maintaining cellular structural integrity and also functions in signal transduction and cellular differentiation. Mutations in this gene cause cryptogenic cirrhosis. Alternatively spliced transcript variants have been found for this gene. | keratin 8 | KRT8 | ENSG00000170421 | NA |
| 5569 | The protein encoded by this gene is a member of the cAMP-dependent protein kinase (PKA) inhibitor family. This protein was demonstrated to interact with and inhibit the activities of both C alpha and C beta catalytic subunits of the PKA. Alternatively spliced transcript variants encoding the same protein have been reported. | protein kinase (cAMP-dependent, catalytic) inhibitor alpha | PKIA | ENSG00000171033 | NA |
| 55711 | This gene belongs to the short chain dehydrogenase/reductase superfamily. It encodes a reductase enzyme involved in the first step of wax biosynthesis wherein fatty acids are converted to fatty alcohols. The encoded peroxisomal protein utilizes saturated fatty acids of 16 or 18 carbons as preferred substrates. Alternatively spliced transcript variants have been observed for this gene. Related pseudogenes have been identified on chromosomes 2, 14 and 22. | fatty acyl-CoA reductase 2 | FAR2 | ENSG00000064763 | NA |
| 11185 | N-methylation of endogenous and xenobiotic compounds is a major method by which they are degraded. This gene encodes an enzyme that N-methylates indoles such as tryptamine. Alternative splicing results in multiple transcript variants. Read-through transcription also exists between this gene and the downstream FAM188B (family with sequence similarity 188, member B) gene. | indolethylamine N-methyltransferase | INMT | ENSG00000241644 | NA |
| 165215 | NA | family with sequence similarity 171 member B | FAM171B | ENSG00000144369 | NA |
| 1123 | This gene encodes GTPase-activating protein for ras-related p21-rac and a phorbol ester receptor. It is predominantly expressed in neurons, and plays an important role in neuronal signal-transduction mechanisms. Mutations in this gene are associated with Duane’s retraction syndrome 2 (DURS2). Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | chimerin 1 | CHN1 | ENSG00000128656 | NA |
| 10435 | CDC42, a small Rho GTPase, regulates the formation of F-actin-containing structures through its interaction with the downstream effector proteins. The protein encoded by this gene is a member of the Borg family of CDC42 effector proteins. Borg family proteins contain a CRIB (Cdc42/Rac interactive-binding) domain. They bind to, and negatively regulate the function of CDC42. Coexpression of this protein with CDC42 suggested a role of this protein in actin filament assembly and cell shape control. | CDC42 effector protein 2 | CDC42EP2 | ENSG00000149798 | NA |
| 150946 | NA | GRB2 associated regulator of MAPK1 subtype 2 | GAREM2 | ENSG00000157833 | NA |
| 4644 | This gene is one of three myosin V heavy-chain genes, belonging to the myosin gene superfamily. Myosin V is a class of actin-based motor proteins involved in cytoplasmic vesicle transport and anchorage, spindle-pole alignment and mRNA translocation. The protein encoded by this gene is abundant in melanocytes and nerve cells. Mutations in this gene cause Griscelli syndrome type-1 (GS1), Griscelli syndrome type-3 (GS3) and neuroectodermal melanolysosomal disease, or Elejalde disease. Multiple alternatively spliced transcript variants encoding different isoforms have been reported, but the full-length nature of some variants has not been determined. | myosin VA | MYO5A | ENSG00000197535 | NA |
| 79772 | NA | multiple C2 and transmembrane domain containing 1 | MCTP1 | ENSG00000175471 | NA |
| 25894 | The protein encoded by this gene can function as a guanine nucleotide exchange factor (GEF) and may play a role in intracellular signaling and cytoskeleton dynamics at the Golgi apparatus. Polymorphisms in the region of this gene have been found to be associated with spinocerebellar ataxia in some study populations. Alternative splicing results in multiple transcript variants. | pleckstrin homology and RhoGEF domain containing G4 | PLEKHG4 | ENSG00000196155 | NA |
| 101928158 | NA | LAMA5 antisense RNA 1 | LAMA5-AS1 | ENSG00000228812 | NA |
| 2534 | This gene is a member of the protein-tyrosine kinase oncogene family. It encodes a membrane-associated tyrosine kinase that has been implicated in the control of cell growth. The protein associates with the p85 subunit of phosphatidylinositol 3-kinase and interacts with the fyn-binding protein. Alternatively spliced transcript variants encoding distinct isoforms exist. | FYN proto-oncogene, Src family tyrosine kinase | FYN | ENSG00000010810 | NA |
| 22821 | This gene encodes a protein that binds inositol 1,3,4,5-tetrakisphosphate and stimulates the GTPase activity of Ras p21. This protein functions as a negative regulator of the Ras signalling pathway. It is localized to the cell membrane via a pleckstrin homology (PH) domain in the C-terminal region. Alternative splicing results in multiple transcript variants. | RAS p21 protein activator 3 | RASA3 | ENSG00000185989 | NA |
| 109 | This gene encodes adenylyl cyclase 3 which is a membrane-associated enzyme and catalyzes the formation of the secondary messenger cyclic adenosine monophosphate (cAMP). This protein appears to be widely expressed in various human tissues and may be involved in a number of physiological and pathophysiological metabolic processes. Two transcript variants encoding different isoforms have been found for this gene. | adenylate cyclase 3 | ADCY3 | ENSG00000138031 | NA |
| 3984 | There are approximately 40 known eukaryotic LIM proteins, so named for the LIM domains they contain. LIM domains are highly conserved cysteine-rich structures containing 2 zinc fingers. Although zinc fingers usually function by binding to DNA or RNA, the LIM motif probably mediates protein-protein interactions. LIM kinase-1 and LIM kinase-2 belong to a small subfamily with a unique combination of 2 N-terminal LIM motifs and a C-terminal protein kinase domain. LIMK1 is a serine/threonine kinase that regulates actin polymerization via phosphorylation and inactivation of the actin binding factor cofilin. This protein is ubiquitously expressed during development and plays a role in many cellular processes associated with cytoskeletal structure. This protein also stimulates axon growth and may play a role in brain development. LIMK1 hemizygosity is implicated in the impaired visuospatial constructive cognition of Williams syndrome. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | LIM domain kinase 1 | LIMK1 | ENSG00000106683 | NA |
| 9057 | NA | solute carrier family 7 member 6 | SLC7A6 | ENSG00000103064 | NA |
| 83875 | This gene encodes an enzyme which oxidizes carotenoids such as beta-carotene during the biosynthesis of vitamin A. Multiple transcript variants encoding different isoforms have been found for this gene. | beta-carotene oxygenase 2 | BCO2 | ENSG00000197580 | NA |
| 5320 | The protein encoded by this gene is a member of the phospholipase A2 family (PLA2). PLA2s constitute a diverse family of enzymes with respect to sequence, function, localization, and divalent cation requirements. This gene product belongs to group II, which contains secreted form of PLA2, an extracellular enzyme that has a low molecular mass and requires calcium ions for catalysis. It catalyzes the hydrolysis of the sn-2 fatty acid acyl ester bond of phosphoglycerides, releasing free fatty acids and lysophospholipids, and thought to participate in the regulation of the phospholipid metabolism in biomembranes. Several alternatively spliced transcript variants with different 5’ UTRs have been found for this gene. | phospholipase A2 group IIA | PLA2G2A | ENSG00000188257 | NA |
| 84627 | This gene encodes a zinc-finger protein. Low-percent homology to certain collagens suggests that it may function as a transcription factor or extra-nuclear regulator factor for the synthesis or organization of collagen fibers. Mutations in this gene cause brittle cornea syndrome. | zinc finger protein 469 | ZNF469 | ENSG00000225614 | NA |
| 114804 | NA | ring finger protein 157 | RNF157 | ENSG00000141576 | NA |
| NA | NA | NA | NA | ENSG00000272016 | TRUE |
| 85464 | This gene encodes a protein tyrosine phosphatase that plays a key role in the regulation of actin filaments. The encoded protein dephosphorylates and activates cofilin, which promotes actin filament depolymerization. Alternative splicing results in multiple transcript variants. | slingshot protein phosphatase 2 | SSH2 | ENSG00000141298 | NA |
| 9196 | This gene encodes a member of the potassium channel, voltage-gated, shaker-related subfamily. The encoded protein is one of the beta subunits, which are auxiliary proteins associating with functional Kv-alpha subunits. The encoded protein forms a heterodimer with the potassium voltage-gated channel, shaker-related subfamily, member 5 gene product and regulates the activity of the alpha subunit. | potassium voltage-gated channel subfamily A regulatory beta subunit 3 | KCNAB3 | ENSG00000170049 | NA |
| 283375 | The protein encoded by this gene belongs to the ZIP family of zinc transporters that transport zinc into cells from outside, and play a crucial role in controlling intracellular zinc levels. Zinc is an essential cofactor for many enzymes and proteins involved in gene transcription, growth, development and differentiation. Mutations in this gene have been associated with autosomal dominant high myopia (MYP24). Alternatively spliced transcript variants have been found for this gene. | solute carrier family 39 member 5 | SLC39A5 | ENSG00000139540 | NA |
| 84909 | This gene encodes a member of the M1 zinc aminopeptidase family. The encoded protein is a zinc-dependent metallopeptidase that catalyzes the removal of an amino acid from the amino terminus of a protein or peptide. This protein may play a role in the generation of angiotensin IV. Alternate splicing results in multiple transcript variants. | chromosome 9 open reading frame 3 | C9orf3 | ENSG00000148120 | NA |
| 2628 | This gene encodes a mitochondrial enzyme that belongs to the amidinotransferase family. This enzyme is involved in creatine biosynthesis, whereby it catalyzes the transfer of a guanido group from L-arginine to glycine, resulting in guanidinoacetic acid, the immediate precursor of creatine. Mutations in this gene cause arginine:glycine amidinotransferase deficiency, an inborn error of creatine synthesis characterized by mental retardation, language impairment, and behavioral disorders. | glycine amidinotransferase | GATM | ENSG00000171766 | NA |
| 10659 | Members of the CELF/BRUNOL protein family contain two N-terminal RNA recognition motif (RRM) domains, one C-terminal RRM domain, and a divergent segment of 160-230 aa between the second and third RRM domains. Members of this protein family regulate pre-mRNA alternative splicing and may also be involved in mRNA editing, and translation. Alternative splicing results in multiple transcript variants encoding different isoforms. | CUGBP, Elav-like family member 2 | CELF2 | ENSG00000048740 | NA |
| 10630 | This gene encodes a type-I integral membrane glycoprotein with diverse distribution in human tissues. The physiological function of this protein may be related to its mucin-type character. The homologous protein in other species has been described as a differentiation antigen and influenza-virus receptor. The specific function of this protein has not been determined but it has been proposed as a marker of lung injury. Alternatively spliced transcript variants encoding different isoforms have been identified. | podoplanin | PDPN | ENSG00000162493 | NA |
| 81932 | NA | haloacid dehalogenase like hydrolase domain containing 3 | HDHD3 | ENSG00000119431 | NA |
| 5010 | This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. The protein encoded by this gene is a major component of central nervous system (CNS) myelin and plays an important role in regulating proliferation and migration of oligodendrocytes. Mouse studies showed that the gene deficiency results in deafness and loss of the Sertoli cell epithelial phenotype in the testis. This protein is a tight junction protein at the human blood-testis barrier (BTB), and the BTB disruption is related to a dysfunction of this gene. Alternatively spliced transcript variants encoding different isoforms have been identified. | claudin 11 | CLDN11 | ENSG00000013297 | NA |
# Save the Ensembl IDs queried for cluster 2, one per line.
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 2, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
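The annotate-then-write step is repeated for each cluster, so it could be wrapped in a small helper. The following is only a sketch under stated assumptions: `write_cluster_genes`, its arguments, and the toy data frame are hypothetical and not part of the original analysis. Dropping rows flagged `notfound` is an optional choice that the original code does not make.

```r
# Hypothetical helper (not part of the original analysis): write the query IDs
# of one cluster's annotation table to a text file, one ID per line.
write_cluster_genes <- function(annot, cluster_id, out_dir, drop_notfound = TRUE) {
  df <- as.data.frame(annot)
  # The `notfound` column is absent from some tables in this document,
  # so check for it before filtering.
  if (drop_notfound && "notfound" %in% names(df)) {
    df <- df[is.na(df$notfound) | !df$notfound, , drop = FALSE]
  }
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster_id, ".txt"))
  write.table(as.character(df$query), path,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Small made-up table so the sketch runs without a network call.
toy <- data.frame(
  query    = c("ENSG00000136872", "ENSG00000255824", "ENSG00000158874"),
  symbol   = c("ALDOB", NA, "APOA2"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
toy_path <- write_cluster_genes(toy, 2, tempdir())
readLines(toy_path)
```

In this report the call might look like `write_cluster_genes(out, 2, "../utilities/GTEX2013_sparse_fac_voom")`. Passing `drop_notfound = FALSE` writes every queried ID, as the original `write.table` call does.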
# Annotate the top genes of cluster 3 with mygene (name, summary, symbol).
out <- mygene::queryMany(gene_list[3, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | X_id | symbol | query | summary | notfound |
|---|---|---|---|---|---|
| UBE2R2 antisense RNA 1 | ENSG00000235481 | UBE2R2-AS1 | ENSG00000235481 | NA | NA |
| myelin regulatory factor | 745 | MYRF | ENSG00000124920 | This gene encodes a transcription factor that is required for central nervous system myelination and may regulate oligodendrocyte differentiation. It is thought to act by increasing the expression of genes that effect myelin production but may also directly promote myelin gene expression. Loss of a similar gene in mouse models results in severe demyelination. Alternative splicing results in multiple transcript variants. | NA |
| protease, serine 3 | 5646 | PRSS3 | ENSG00000010438 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is expressed in the brain and pancreas and is resistant to common trypsin inhibitors. It is active on peptide linkages involving the carboxyl group of lysine or arginine. This gene is localized to the locus of T cell receptor beta variable orphans on chromosome 9. Four transcript variants encoding different isoforms have been described for this gene. | NA |
| glial fibrillary acidic protein | 2670 | GFAP | ENSG00000131095 | This gene encodes one of the major intermediate filament proteins of mature astrocytes. It is used as a marker to distinguish astrocytes from other glial cells during development. Mutations in this gene cause Alexander disease, a rare disorder of astrocytes in the central nervous system. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | NA |
| gastric inhibitory polypeptide receptor | 2696 | GIPR | ENSG00000010310 | This gene encodes a G-protein coupled receptor for gastric inhibitory polypeptide (GIP), which was originally identified as an activity in gut extracts that inhibited gastric acid secretion and gastrin release, but subsequently was demonstrated to stimulate insulin release in the presence of elevated glucose. Mice lacking this gene exhibit higher blood glucose levels with impaired initial insulin response after oral glucose load. Defect in this gene thus may contribute to the pathogenesis of diabetes. | NA |
| aldolase, fructose-bisphosphate B | 229 | ALDOB | ENSG00000136872 | Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | NA |
| chemerin chemokine-like receptor 1 | 1240 | CMKLR1 | ENSG00000174600 | NA | NA |
| hepcidin antimicrobial peptide | 57817 | HAMP | ENSG00000105697 | The product encoded by this gene is involved in the maintenance of iron homeostasis, and it is necessary for the regulation of iron storage in macrophages, and for intestinal iron absorption. The preproprotein is post-translationally cleaved into mature peptides of 20, 22 and 25 amino acids, and these active peptides are rich in cysteines, which form intramolecular bonds that stabilize their beta-sheet structures. These peptides exhibit antimicrobial activity against bacteria and fungi. Mutations in this gene cause hemochromatosis type 2B, also known as juvenile hemochromatosis, a disease caused by severe iron overload that results in cardiomyopathy, cirrhosis, and endocrine failure. | NA |
| fibrillarin-like 1 | ENSG00000188573 | FBLL1 | ENSG00000188573 | NA | NA |
| calmegin | 1047 | CLGN | ENSG00000153132 | Calmegin is a testis-specific endoplasmic reticulum chaperone protein. CLGN may play a role in spermatogenesis and infertility. | NA |
| aspartate beta-hydroxylase domain containing 1 | 253982 | ASPHD1 | ENSG00000174939 | NA | NA |
| solute carrier family 16 member 9 | 220963 | SLC16A9 | ENSG00000165449 | NA | NA |
| integrin subunit alpha 8 | 8516 | ITGA8 | ENSG00000077943 | Integrins are heterodimeric transmembrane receptor proteins that mediate numerous cellular processes including cell adhesion, cytoskeletal rearrangement, and activation of cell signaling pathways. Integrins are composed of alpha and beta subunits. This gene encodes the alpha 8 subunit of the heterodimeric integrin alpha8beta1 protein. The encoded protein is a single-pass type 1 membrane protein that contains multiple FG-GAP repeats. This repeat is predicted to fold into a beta propeller structure. This gene regulates the recruitment of mesenchymal cells into epithelial structures, mediates cell-cell interactions, and regulates neurite outgrowth of sensory and motor neurons. The integrin alpha8beta1 protein thus plays an important role in wound-healing and organogenesis. Mutations in this gene have been associated with renal hypodysplasia/aplasia-1 (RHDA1) and with several animal models of chronic kidney disease. Alternate splicing results in multiple transcript variants encoding distinct isoforms. | NA |
| synuclein beta | 6620 | SNCB | ENSG00000074317 | This gene encodes a member of a small family of proteins that inhibit phospholipase D2 and may function in neuronal plasticity. The encoded protein is abundant in lesions of patients with Alzheimer disease. A mutation in this gene was found in individuals with dementia with Lewy bodies. Alternative splicing results in multiple transcript variants. | NA |
| amine oxidase, copper containing 3 | 8639 | AOC3 | ENSG00000131471 | This gene encodes a member of the semicarbazide-sensitive amine oxidase family. Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes in the presence of copper and quinone cofactor. The encoded protein is localized to the cell surface, has adhesive properties as well as monoamine oxidase activity, and may be involved in leukocyte trafficking. Alterations in levels of the encoded protein may be associated with many diseases, including diabetes mellitus. A pseudogene of this gene has been described and is located approximately 9-kb downstream on the same chromosome. Alternative splicing results in multiple transcript variants. | NA |
| protein phosphatase 1 regulatory inhibitor subunit 1B | 84152 | PPP1R1B | ENSG00000131771 | This gene encodes a bifunctional signal transduction molecule. Dopaminergic and glutamatergic receptor stimulation regulates its phosphorylation and function as a kinase or phosphatase inhibitor. As a target for dopamine, this gene may serve as a therapeutic target for neurologic and psychiatric disorders. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| myocyte enhancer factor 2C | 4208 | MEF2C | ENSG00000081189 | This locus encodes a member of the MADS box transcription enhancer factor 2 (MEF2) family of proteins, which play a role in myogenesis. The encoded protein, MEF2 polypeptide C, has both trans-activating and DNA binding activities. This protein may play a role in maintaining the differentiated state of muscle cells. Mutations and deletions at this locus have been associated with severe mental retardation, stereotypic movements, epilepsy, and cerebral malformation. Alternatively spliced transcript variants have been described. | NA |
| NA | ENSG00000245864 | CTC-467M3.1 | ENSG00000245864 | NA | NA |
| RUN domain containing 3A | 10900 | RUNDC3A | ENSG00000108309 | NA | NA |
| poly(A) binding protein interacting protein 2B | 400961 | PAIP2B | ENSG00000124374 | Most mRNAs, except for histones, contain a 3-prime poly(A) tail. Poly(A)-binding protein (PABP; see MIM 604679) enhances translation by circularizing mRNA through its interaction with the translation initiation factor EIF4G1 (MIM 600495) and the poly(A) tail. Various PABP-binding proteins regulate PABP activity, including PAIP1 (MIM 605184), a translational stimulator, and PAIP2A (MIM 605604) and PAIP2B, translational inhibitors (Derry et al., 2006 [PubMed 17381337]). | NA |
| NA | ENSG00000251660 | AC007036.5 | ENSG00000251660 | NA | NA |
| glycine amidinotransferase | 2628 | GATM | ENSG00000171766 | This gene encodes a mitochondrial enzyme that belongs to the amidinotransferase family. This enzyme is involved in creatine biosynthesis, whereby it catalyzes the transfer of a guanido group from L-arginine to glycine, resulting in guanidinoacetic acid, the immediate precursor of creatine. Mutations in this gene cause arginine:glycine amidinotransferase deficiency, an inborn error of creatine synthesis characterized by mental retardation, language impairment, and behavioral disorders. | NA |
| protein disulfide isomerase family A member 2 | 64714 | PDIA2 | ENSG00000185615 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | NA |
| ITPR1 antisense RNA 1 (head to head) | ENSG00000231249 | ITPR1-AS1 | ENSG00000231249 | NA | NA |
| spermine oxidase | 54498 | SMOX | ENSG00000088826 | Polyamines are ubiquitous polycationic alkylamines which include spermine, spermidine, putrescine, and agmatine. These molecules participate in a broad range of cellular functions which include cell cycle modulation, scavenging reactive oxygen species, and the control of gene expression. These molecules also play important roles in neurotransmission through their regulation of cell-surface receptor activity, involvement in intracellular signalling pathways, and their putative roles as neurotransmitters. This gene encodes an FAD-containing enzyme that catalyzes the oxidation of spermine to spermidine and secondarily produces hydrogen peroxide. Multiple transcript variants encoding different isoenzymes have been identified for this gene, some of which have failed to demonstrate significant oxidase activity on natural polyamine substrates. The characterized isoenzymes have distinctive biochemical characteristics and substrate specificities, suggesting the existence of additional levels of complexity in polyamine catabolism. | NA |
| keratin 8 | 3856 | KRT8 | ENSG00000170421 | This gene is a member of the type II keratin family clustered on the long arm of chromosome 12. Type I and type II keratins heteropolymerize to form intermediate-sized filaments in the cytoplasm of epithelial cells. The product of this gene typically dimerizes with keratin 18 to form an intermediate filament in simple single-layered epithelial cells. This protein plays a role in maintaining cellular structural integrity and also functions in signal transduction and cellular differentiation. Mutations in this gene cause cryptogenic cirrhosis. Alternatively spliced transcript variants have been found for this gene. | NA |
| NA | ENSG00000255498 | RP11-618K13.2 | ENSG00000255498 | NA | NA |
| NA | ENSG00000266844 | RP11-862L9.3 | ENSG00000266844 | NA | NA |
| integrin subunit alpha 3 | 3675 | ITGA3 | ENSG00000005884 | The gene encodes a member of the integrin alpha chain family of proteins. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain that function as cell surface adhesion molecules. The encoded preproprotein is proteolytically processed to generate light and heavy chains that comprise the alpha 3 subunit. This subunit joins with a beta 1 subunit to form an integrin that interacts with extracellular matrix proteins including members of the laminin family. Expression of this gene may be correlated with breast cancer metastasis. | NA |
| chromosome 2 open reading frame 82 | 389084 | C2orf82 | ENSG00000182600 | NA | NA |
| NA | NA | NA | ENSG00000165862 | NA | TRUE |
| matrix Gla protein | 4256 | MGP | ENSG00000111341 | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | NA |
| myozenin 1 | 58529 | MYOZ1 | ENSG00000177791 | The protein encoded by this gene is primarily expressed in the skeletal muscle, and belongs to the myozenin family. Members of this family function as calcineurin-interacting proteins that help tether calcineurin to the sarcomere of cardiac and skeletal muscle. They play an important role in modulation of calcineurin signaling. | NA |
| energy homeostasis associated | 375704 | ENHO | ENSG00000168913 | NA | NA |
| NA | ENSG00000269906 | RP11-248J18.2 | ENSG00000269906 | NA | NA |
| maturin, neural progenitor differentiation regulator homolog (Xenopus) | 222166 | MTURN | ENSG00000180354 | NA | NA |
| solute carrier family 47 member 1 | 55244 | SLC47A1 | ENSG00000142494 | This gene is located within the Smith-Magenis syndrome region on chromosome 17. It encodes a protein of unknown function. | NA |
| keratin 7 | 3855 | KRT7 | ENSG00000135480 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the simple epithelia lining the cavities of the internal organs and in the gland ducts and blood vessels. The genes encoding the type II cytokeratins are clustered in a region of chromosome 12q12-q13. Alternative splicing may result in several transcript variants; however, not all variants have been fully described. | NA |
| NA | ENSG00000269514 | RP11-370I10.12 | ENSG00000269514 | NA | NA |
| mucin 7, secreted | 4589 | MUC7 | ENSG00000171195 | This gene encodes a small salivary mucin, which is thought to play a role in facilitating the clearance of bacteria in the oral cavity and to aid in mastication, speech, and swallowing. The central domain of this glycoprotein contains tandem repeats, each composed of 23 amino acids. This antimicrobial protein has antibacterial and antifungal activity. The most common allele contains 6 repeats, and some alleles may be associated with susceptibility to asthma. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. | NA |
| solute carrier family 37 member 2 | 219855 | SLC37A2 | ENSG00000134955 | NA | NA |
| natriuretic peptide receptor 1 | 4881 | NPR1 | ENSG00000169418 | Guanylyl cyclases, catalyzing the production of cGMP from GTP, are classified as soluble and membrane forms (Garbers and Lowe, 1994 [PubMed 7982997]). The membrane guanylyl cyclases, often termed guanylyl cyclases A through F, form a family of cell-surface receptors with a similar topographic structure: an extracellular ligand-binding domain, a single membrane-spanning domain, and an intracellular region that contains a protein kinase-like domain and a cyclase catalytic domain. GC-A and GC-B function as receptors for natriuretic peptides; they are also referred to as atrial natriuretic peptide receptor A (NPR1) and type B (NPR2; MIM 108961). Also see NPR3 (MIM 108962), which encodes a protein with only the ligand-binding transmembrane and 37-amino acid cytoplasmic domains. NPR1 is a membrane-bound guanylate cyclase that serves as the receptor for both atrial and brain natriuretic peptides (ANP (MIM 108780) and BNP (MIM 600295), respectively). | NA |
| interleukin 15 | 3600 | IL15 | ENSG00000164136 | The protein encoded by this gene is a cytokine that regulates T and natural killer cell activation and proliferation. This cytokine and interleukin 2 share many biological activities. They are found to bind common hematopoietin receptor subunits, and may compete for the same receptor, and thus negatively regulate each other’s activity. The number of CD8+ memory cells is shown to be controlled by a balance between this cytokine and IL2. This cytokine induces the activation of JAK kinases, as well as the phosphorylation and activation of transcription activators STAT3, STAT5, and STAT6. Studies of the mouse counterpart suggested that this cytokine may increase the expression of apoptosis inhibitor BCL2L1/BCL-x(L), possibly through the transcription activation activity of STAT6, and thus prevent apoptosis. Alternatively spliced transcript variants of this gene have been reported. | NA |
| neuralized E3 ubiquitin protein ligase 1 | 9148 | NEURL1 | ENSG00000107954 | NA | NA |
| pellino E3 ubiquitin protein ligase family member 2 | 57161 | PELI2 | ENSG00000139946 | NA | NA |
| C-X-C motif chemokine ligand 1 | 2919 | CXCL1 | ENSG00000163739 | This antimicrobial gene encodes a member of the CXC subfamily of chemokines. The encoded protein is a secreted growth factor that signals through the G-protein coupled receptor, CXC receptor 2. This protein plays a role in inflammation and as a chemoattractant for neutrophils. Aberrant expression of this protein is associated with the growth and progression of certain tumors. A naturally occurring processed form of this protein has increased chemotactic activity. Alternate splicing results in coding and non-coding variants of this gene. A pseudogene of this gene is found on chromosome 4. | NA |
| arylsulfatase G | 22901 | ARSG | ENSG00000141337 | The protein encoded by this gene belongs to the sulfatase enzyme family. Sulfatases hydrolyze sulfate esters from sulfated steroids, carbohydrates, proteoglycans, and glycolipids. They are involved in hormone biosynthesis, modulation of cell signaling, and degradation of macromolecules. This protein displays arylsulfatase activity at acidic pH, as is typical of lysosomal sulfatases, and has been shown to localize in the lysosomes. Alternatively spliced transcript variants have been found for this gene. | NA |
| pentraxin 3 | 5806 | PTX3 | ENSG00000163661 | NA | NA |
| thromboxane A2 receptor | 6915 | TBXA2R | ENSG00000006638 | This gene encodes a member of the G protein-coupled receptor family. The protein interacts with thromboxane A2 to induce platelet aggregation and regulate hemostasis. A mutation in this gene results in a bleeding disorder. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| collagen type IV alpha 4 chain | 1286 | COL4A4 | ENSG00000081052 | This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. This particular collagen IV subunit, however, is only found in a subset of basement membranes. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. Mutations in this gene are associated with type II autosomal recessive Alport syndrome (hereditary glomerulonephropathy) and with familial benign hematuria (thin basement membrane disease). Two transcripts, differing only in their transcription start sites, have been identified for this gene and, as is common for collagen genes, multiple polyadenylation sites are found in the 3’ UTR. | NA |
| transmembrane protease, serine 5 | 80975 | TMPRSS5 | ENSG00000166682 | This gene encodes a protein that belongs to the serine protease family. Serine proteases are known to be involved in many physiological and pathological processes. Alternative splicing results in multiple transcript variants. | NA |
| transthyretin | 7276 | TTR | ENSG00000118271 | This gene encodes transthyretin, one of the three prealbumins including alpha-1-antitrypsin, transthyretin and orosomucoid. Transthyretin is a carrier protein; it transports thyroid hormones in the plasma and cerebrospinal fluid, and also transports retinol (vitamin A) in the plasma. The protein consists of a tetramer of identical subunits. More than 80 different mutations in this gene have been reported; most mutations are related to amyloid deposition, affecting predominantly peripheral nerve and/or the heart, and a small portion of the gene mutations is non-amyloidogenic. The diseases caused by mutations include amyloidotic polyneuropathy, euthyroid hyperthyroxinaemia, amyloidotic vitreous opacities, cardiomyopathy, oculoleptomeningeal amyloidosis, meningocerebrovascular amyloidosis, carpal tunnel syndrome, etc. | NA |
| solute carrier family 7 member 5 | 8140 | SLC7A5 | ENSG00000103257 | NA | NA |
| plakophilin 2 | 5318 | PKP2 | ENSG00000057294 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This gene product may regulate the signaling activity of beta-catenin. Two alternately spliced transcripts encoding two protein isoforms have been identified. A processed pseudogene with high similarity to this locus has been mapped to chromosome 12p13. | NA |
| leucine rich alpha-2-glycoprotein 1 | 116844 | LRG1 | ENSG00000171236 | The leucine-rich repeat (LRR) family of proteins, including LRG1, have been shown to be involved in protein-protein interaction, signal transduction, and cell adhesion and development. LRG1 is expressed during granulocyte differentiation (O’Donnell et al., 2002 [PubMed 12223515]). | NA |
| ADAM metallopeptidase with thrombospondin type 1 motif 7 | 11173 | ADAMTS7 | ENSG00000136378 | The protein encoded by this gene is a member of the ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) family. Members of this family share several distinct protein modules, including a propeptide region, a metalloproteinase domain, a disintegrin-like domain, and a thrombospondin type 1 (TS) motif. Individual members of this family differ in the number of C-terminal TS motifs, and some have unique C-terminal domains. The encoded preproprotein is proteolytically processed to generate the mature enzyme. This enzyme contains two C-terminal TS motifs and may regulate vascular smooth muscle cell (VSMC) migration. Mutations in this gene may be associated with susceptibility to coronary artery disease. | NA |
| alpha-2-glycoprotein 1, zinc-binding | 563 | AZGP1 | ENSG00000160862 | NA | NA |
| G protein-coupled receptor kinase 5 | 2869 | GRK5 | ENSG00000198873 | This gene encodes a member of the guanine nucleotide-binding protein (G protein)-coupled receptor kinase subfamily of the Ser/Thr protein kinase family. The protein phosphorylates the activated forms of G protein-coupled receptors thus initiating their deactivation. It has also been shown to play a role in regulating the motility of polymorphonuclear leukocytes (PMNs). | NA |
| myosin, heavy chain 10, non-muscle | 4628 | MYH10 | ENSG00000133026 | This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| chromogranin A | 1113 | CHGA | ENSG00000100604 | The protein encoded by this gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins. It is found in secretory vesicles of neurons and endocrine cells. This gene product is a precursor to three biologically active peptides; vasostatin, pancreastatin, and parastatin. These peptides act as autocrine or paracrine negative modulators of the neuroendocrine system. Two other peptides, catestatin and chromofungin, have antimicrobial activity and antifungal activity, respectively. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| NA | ENSG00000224459 | RP11-169K16.4 | ENSG00000224459 | NA | NA |
| aldo-keto reductase family 7 member A3 | 22977 | AKR7A3 | ENSG00000162482 | Aldo-keto reductases, such as AKR7A3, are involved in the detoxification of aldehydes and ketones. | NA |
| cadherin 1 | 999 | CDH1 | ENSG00000039068 | This gene encodes a classical cadherin of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature glycoprotein. This calcium-dependent cell-cell adhesion protein is comprised of five extracellular cadherin repeats, a transmembrane region and a highly conserved cytoplasmic tail. Mutations in this gene are correlated with gastric, breast, colorectal, thyroid and ovarian cancer. Loss of function of this gene is thought to contribute to cancer progression by increasing proliferation, invasion, and/or metastasis. The ectodomain of this protein mediates bacterial adhesion to mammalian cells and the cytoplasmic domain is required for internalization. This gene is present in a gene cluster with other members of the cadherin family on chromosome 16. | NA |
| chitinase 3 like 1 | 1116 | CHI3L1 | ENSG00000133048 | Chitinases catalyze the hydrolysis of chitin, which is an abundant glycopolymer found in insect exoskeletons and fungal cell walls. The glycoside hydrolase 18 family of chitinases includes eight human family members. This gene encodes a glycoprotein member of the glycosyl hydrolase 18 family. The protein lacks chitinase activity and is secreted by activated macrophages, chondrocytes, neutrophils and synovial cells. The protein is thought to play a role in the process of inflammation and tissue remodeling. | NA |
| thyroglobulin | 7038 | TG | ENSG00000042832 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | NA |
| mitogen-activated protein kinase 8 interacting protein 1 | 9479 | MAPK8IP1 | ENSG00000121653 | This gene encodes a regulator of the pancreatic beta-cell function. It is highly similar to JIP-1, a mouse protein known to be a regulator of c-Jun amino-terminal kinase (Mapk8). This protein has been shown to prevent MAPK8 mediated activation of transcription factors, and to decrease IL-1 beta and MAP kinase kinase 1 (MEKK1) induced apoptosis in pancreatic beta cells. This protein also functions as a DNA-binding transactivator of the glucose transporter GLUT2. RE1-silencing transcription factor (REST) is reported to repress the expression of this gene in insulin-secreting beta cells. This gene is found to be mutated in a type 2 diabetes family, and thus is thought to be a susceptibility gene for type 2 diabetes. | NA |
| family with sequence similarity 134 member B | 54463 | FAM134B | ENSG00000154153 | The protein encoded by this gene is a cis-Golgi transmembrane protein that may be necessary for the long-term survival of nociceptive and autonomic ganglion neurons. Mutations in this gene are a cause of hereditary sensory and autonomic neuropathy type IIB (HSAN IIB), and this gene may also play a role in susceptibility to vascular dementia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| growth arrest specific 5 (non-protein coding) | 60674 | GAS5 | ENSG00000234741 | This gene produces a spliced long non-coding RNA and is a member of the 5’ terminal oligo-pyrimidine class of genes. It is a small nucleolar RNA host gene, containing multiple C/D box snoRNA genes in its introns. Part of the secondary RNA structure of the encoded transcript mimics glucocorticoid response element (GRE) which means it can bind to the DNA binding domain of the glucocorticoid receptor (nuclear receptor subfamily 3, group C, member 1). This action blocks the glucocorticoid receptor from being activated and thereby stops it from regulating the transcription of its target genes. This transcript is also thought to regulate the transcriptional activity of other receptors, such as androgen, progesterone and mineralocorticoid receptors, that can bind to its GRE mimic region. Multiple functions have been associated with this transcript, including cellular growth arrest and apoptosis. It has also been identified as a potential tumor suppressor, with its down-regulation associated with cancer in multiple different tissues. | NA |
| angiotensinogen | 183 | AGT | ENSG00000135744 | The protein encoded by this gene, pre-angiotensinogen or angiotensinogen precursor, is expressed in the liver and is cleaved by the enzyme renin in response to lowered blood pressure. The resulting product, angiotensin I, is then cleaved by angiotensin converting enzyme (ACE) to generate the physiologically active enzyme angiotensin II. The protein is involved in maintaining blood pressure and in the pathogenesis of essential hypertension and preeclampsia. Mutations in this gene are associated with susceptibility to essential hypertension, and can cause renal tubular dysgenesis, a severe disorder of renal tubular development. Defects in this gene have also been associated with non-familial structural atrial fibrillation, and inflammatory bowel disease. | NA |
| kinesin family member 1A | 547 | KIF1A | ENSG00000130294 | The protein encoded by this gene is a member of the kinesin family and functions as an anterograde motor protein that transports membranous organelles along axonal microtubules. Mutations at this locus have been associated with spastic paraplegia-30 and hereditary sensory neuropathy IIC. Alternatively spliced transcript variants encoding distinct isoforms have been described. | NA |
| immunoglobulin heavy constant gamma 1 (G1m marker) | ENSG00000211896 | IGHG1 | ENSG00000211896 | NA | NA |
| retinol dehydrogenase 10 (all-trans) | 157506 | RDH10 | ENSG00000121039 | This gene encodes a retinol dehydrogenase, which converts all-trans-retinol to all-trans-retinal, with preference for NADP as a cofactor. Studies in mice suggest that this protein is essential for synthesis of embryonic retinoic acid and is required for limb, craniofacial, and organ development. | NA |
| ATP binding cassette subfamily A member 1 | 19 | ABCA1 | ENSG00000165029 | The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. With cholesterol as its substrate, this protein functions as a cholesterol efflux pump in the cellular lipid removal pathway. Mutations in this gene have been associated with Tangier’s disease and familial high-density lipoprotein deficiency. | NA |
| pleckstrin and Sec7 domain containing 2 | 84249 | PSD2 | ENSG00000146005 | NA | NA |
| NA | NA | NA | ENSG00000203306 | NA | TRUE |
| solute carrier family 18 member A2 | 6571 | SLC18A2 | ENSG00000165646 | The vesicular monoamine transporter acts to accumulate cytosolic monoamines into synaptic vesicles, using the proton gradient maintained across the synaptic vesicular membrane. Its proper function is essential to the correct activity of the monoaminergic systems that have been implicated in several human neuropsychiatric disorders. The transporter is a site of action of important drugs, including reserpine and tetrabenazine (summary by Peter et al., 1993 [PubMed 7905859]). See also SLC18A1 (MIM 193002). | NA |
| indolethylamine N-methyltransferase | 11185 | INMT | ENSG00000241644 | N-methylation of endogenous and xenobiotic compounds is a major method by which they are degraded. This gene encodes an enzyme that N-methylates indoles such as tryptamine. Alternative splicing results in multiple transcript variants. Read-through transcription also exists between this gene and the downstream FAM188B (family with sequence similarity 188, member B) gene. | NA |
| insulin like 3 | 3640 | INSL3 | ENSG00000248099 | This gene encodes a member of the insulin-like hormone superfamily. The encoded protein is mainly produced in gonadal tissues. Studies of the mouse counterpart suggest that this gene may be involved in the development of urogenital tract and female fertility. This protein may also act as a hormone to regulate growth and differentiation of gubernaculum, and thus mediating intra-abdominal testicular descent. Mutations in this gene may lead to cryptorchidism. Alternate splicing results in multiple transcript variants. | NA |
| immunoglobulin lambda like polypeptide 5 | 100423062 | IGLL5 | ENSG00000254709 | This gene encodes one of the immunoglobulin lambda-like polypeptides. It is located within the immunoglobulin lambda locus but it does not require somatic rearrangement for expression. The first exon of this gene is unrelated to immunoglobulin variable genes; the second and third exons are the immunoglobulin lambda joining 1 and the immunoglobulin lambda constant 1 gene segments. Alternative splicing results in multiple transcript variants. | NA |
| chymotrypsin like | 1506 | CTRL | ENSG00000141086 | NA | NA |
| essential meiotic structure-specific endonuclease subunit 2 | 197342 | EME2 | ENSG00000197774 | EME2 forms a heterodimer with MUS81 (MIM 606591) that functions as an XPF (MIM 278760)-type flap/fork endonuclease in DNA repair (Ciccia et al., 2007 [PubMed 17289582]). | NA |
| inositol 1,4,5-trisphosphate receptor type 1 | 3708 | ITPR1 | ENSG00000150995 | This gene encodes an intracellular receptor for inositol 1,4,5-trisphosphate. Upon stimulation by inositol 1,4,5-trisphosphate, this receptor mediates calcium release from the endoplasmic reticulum. Mutations in this gene cause spinocerebellar ataxia type 15, a disease associated with a heterogeneous group of cerebellar disorders. Multiple transcript variants have been identified for this gene. | NA |
| lipocalin 2 | 3934 | LCN2 | ENSG00000148346 | This gene encodes a protein that belongs to the lipocalin family. Members of this family transport small hydrophobic molecules such as lipids, steroid hormones and retinoids. The protein encoded by this gene is a neutrophil gelatinase-associated lipocalin and plays a role in innate immunity by limiting bacterial growth as a result of sequestering iron-containing siderophores. The presence of this protein in blood and urine is an early biomarker of acute kidney injury. This protein is thought to be involved in multiple cellular processes, including maintenance of skin homeostasis, and suppression of invasiveness and metastasis. Mice lacking this gene are more susceptible to bacterial infection than wild type mice. | NA |
| asparaginase like 1 | 80150 | ASRGL1 | ENSG00000162174 | NA | NA |
| ANO1 antisense RNA 1 | ENSG00000254902 | ANO1-AS1 | ENSG00000254902 | NA | NA |
| visinin like 1 | 7447 | VSNL1 | ENSG00000163032 | This gene is a member of the visinin/recoverin subfamily of neuronal calcium sensor proteins. The encoded protein is strongly expressed in granule cells of the cerebellum where it associates with membranes in a calcium-dependent manner and modulates intracellular signaling pathways of the central nervous system by directly or indirectly regulating the activity of adenylyl cyclase. Alternatively spliced transcript variants have been observed, but their full-length nature has not been determined. | NA |
| v-myc avian myelocytomatosis viral oncogene lung carcinoma derived homolog | 4610 | MYCL | ENSG00000116990 | NA | NA |
| heat shock protein family B (small) member 7 | 27129 | HSPB7 | ENSG00000173641 | NA | NA |
| heparan sulfate-glucosamine 3-sulfotransferase 3B1 | 9953 | HS3ST3B1 | ENSG00000125430 | The protein encoded by this gene is a type II integral membrane protein that belongs to the 3-O-sulfotransferases family. These proteins catalyze the addition of sulfate groups at the 3-OH position of glucosamine in heparan sulfate. The substrate specificity of individual members of the family is based on prior modification of the heparan sulfate chain, thus allowing different members of the family to generate binding sites for different proteins on the same heparan sulfate chain. Following treatment with a histone deacetylase inhibitor, expression of this gene is activated in a pancreatic cell line. The increased expression results in promotion of the epithelial-mesenchymal transition. In addition, the modification catalyzed by this protein allows herpes simplex virus membrane fusion and penetration. A very closely related homolog with an almost identical sulfotransferase domain maps less than 1 Mb away. Alternative splicing results in multiple transcript variants. | NA |
| mitochondrial elongation factor 2 | 125170 | MIEF2 | ENSG00000177427 | This gene encodes an outer mitochondrial membrane protein that functions in the regulation of mitochondrial morphology. It can directly recruit the fission mediator dynamin-related protein 1 (Drp1) to the mitochondrial surface. The gene is located within the Smith-Magenis syndrome region on chromosome 17. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| thyroid peroxidase | 7173 | TPO | ENSG00000115705 | This gene encodes a membrane-bound glycoprotein. The encoded protein acts as an enzyme and plays a central role in thyroid gland function. The protein functions in the iodination of tyrosine residues in thyroglobulin and phenoxy-ester formation between pairs of iodinated tyrosines to generate the thyroid hormones, thyroxine and triiodothyronine. Mutations in this gene are associated with several disorders of thyroid hormonogenesis, including congenital hypothyroidism, congenital goiter, and thyroid hormone organification defect IIA. Multiple transcript variants encoding distinct isoforms have been identified for this gene, but the full-length nature of some variants has not been determined. | NA |
| synaptonemal complex central element protein 1 | 93426 | SYCE1 | ENSG00000171772 | NA | NA |
| CUB and zona pellucida like domains 1 | 50624 | CUZD1 | ENSG00000138161 | NA | NA |
| LIM domain binding 3 | 11155 | LDB3 | ENSG00000122367 | This gene encodes a PDZ domain-containing protein. PDZ motifs are modular protein-protein interaction domains consisting of 80-120 amino acid residues. PDZ domain-containing proteins interact with each other in cytoskeletal assembly or with other proteins involved in targeting and clustering of membrane proteins. The protein encoded by this gene interacts with alpha-actinin-2 through its N-terminal PDZ domain and with protein kinase C via its C-terminal LIM domains. The LIM domain is a cysteine-rich motif defined by 50-60 amino acids containing two zinc-binding modules. This protein also interacts with all three members of the myozenin family. Mutations in this gene have been associated with myofibrillar myopathy and dilated cardiomyopathy. Alternatively spliced transcript variants encoding different isoforms have been identified; all isoforms have N-terminal PDZ domains while only longer isoforms (1, 2 and 5) have C-terminal LIM domains. | NA |
| neurexophilin 3 | 11248 | NXPH3 | ENSG00000182575 | NA | NA |
| N-myc downstream regulated 1 | 10397 | NDRG1 | ENSG00000104419 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein involved in stress responses, hormone responses, cell growth, and differentiation. The encoded protein is necessary for p53-mediated caspase activation and apoptosis. Mutations in this gene are a cause of Charcot-Marie-Tooth disease type 4D, and expression of this gene may be a prognostic indicator for several types of cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| geminin, DNA replication inhibitor | 51053 | GMNN | ENSG00000112312 | This gene encodes a protein that plays a critical role in cell cycle regulation. The encoded protein inhibits DNA replication by binding to DNA replication factor Cdt1, preventing the incorporation of minichromosome maintenance proteins into the pre-replication complex. The encoded protein is expressed during the S and G2 phases of the cell cycle and is degraded by the anaphase-promoting complex during the metaphase-anaphase transition. Increased expression of this gene may play a role in several malignancies including colon, rectal and breast cancer. Alternatively spliced transcript variants have been observed for this gene, and two pseudogenes of this gene are located on the short arm of chromosome 16. | NA |
| cornichon family AMPA receptor auxiliary protein 2 | 254263 | CNIH2 | ENSG00000174871 | The protein encoded by this gene is an auxiliary subunit of the ionotropic glutamate receptor of the AMPA subtype. AMPA receptors mediate fast synaptic neurotransmission in the central nervous system. This protein has been reported to interact with the Type I AMPA receptor regulatory protein isoform gamma-8 to control assembly of hippocampal AMPA receptor complexes, thereby modulating receptor gating and pharmacology. Alternative splicing results in multiple transcript variants. | NA |
| eukaryotic translation elongation factor 1 beta 2 pseudogene 2 | ENSG00000213864 | EEF1B2P2 | ENSG00000213864 | NA | NA |
| frizzled class receptor 1 | 8321 | FZD1 | ENSG00000157240 | Members of the ‘frizzled’ gene family encode 7-transmembrane domain proteins that are receptors for Wnt signaling proteins. The FZD1 protein contains a signal peptide, a cysteine-rich domain in the N-terminal extracellular region, 7 transmembrane domains, and a C-terminal PDZ domain-binding motif. The FZD1 transcript is expressed in various tissues. | NA |
# Save the query IDs annotated for cluster 3, one per line
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 3, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
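The export step above recurs for each cluster with only the index changing. A minimal sketch of a helper that wraps it follows; `write_cluster_genes`, `out_dir`, and the other argument names are hypothetical and not part of the original analysis. Unlike the original call, it drops repeated query IDs before writing, which is an assumption about what the downstream file should contain.

```r
# Hypothetical helper (not from the original analysis): write the unique
# query IDs for one cluster to gene_names_clus_<cluster>.txt in `out_dir`.
write_cluster_genes <- function(query_ids, cluster, out_dir) {
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  ids <- unique(as.character(query_ids))  # assumption: one line per distinct ID
  write.table(ids, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Usage with a few IDs taken from the table above, written to a temporary directory
ids <- c("ENSG00000121039", "ENSG00000165029", "ENSG00000165029")
path <- write_cluster_genes(ids, cluster = 3, out_dir = tempdir())
readLines(path)
```

In the notebook itself the call would presumably take `out$query` and the `../utilities/GTEX2013_sparse_fac_voom` directory in place of the toy inputs.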
# Annotate the top genes of cluster 4 (Ensembl gene IDs) with name, summary and symbol
out <- mygene::queryMany(gene_list[4,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| X_id | symbol | query | name | summary | notfound |
|---|---|---|---|---|---|
| 3512 | JCHAIN | ENSG00000132465 | joining chain of multimeric IgA and IgM | NA | NA |
| ENSG00000211899 | IGHM | ENSG00000211899 | immunoglobulin heavy constant mu | NA | NA |
| ENSG00000211677 | IGLC2 | ENSG00000211677 | immunoglobulin lambda constant 2 (Kern-Oz- marker) | NA | NA |
| ENSG00000211679 | IGLC3 | ENSG00000211679 | immunoglobulin lambda constant 3 (Kern-Oz+ marker) | NA | NA |
| 100423062 | IGLL5 | ENSG00000254709 | immunoglobulin lambda like polypeptide 5 | This gene encodes one of the immunoglobulin lambda-like polypeptides. It is located within the immunoglobulin lambda locus but it does not require somatic rearrangement for expression. The first exon of this gene is unrelated to immunoglobulin variable genes; the second and third exons are the immunoglobulin lambda joining 1 and the immunoglobulin lambda constant 1 gene segments. Alternative splicing results in multiple transcript variants. | NA |
| 10900 | RUNDC3A | ENSG00000108309 | RUN domain containing 3A | NA | NA |
| ENSG00000211675 | IGLC1 | ENSG00000211675 | immunoglobulin lambda constant 1 (Mcg marker) | NA | NA |
| 51316 | PLAC8 | ENSG00000145287 | placenta specific 8 | NA | NA |
| 100507387 | LOC100507387 | ENSG00000182230 | uncharacterized LOC100507387 | NA | NA |
| 202134 | FAM153B | ENSG00000182230 | family with sequence similarity 153 member B | NA | NA |
| ENSG00000211895 | IGHA1 | ENSG00000211895 | immunoglobulin heavy constant alpha 1 | NA | NA |
| ENSG00000211893 | IGHG2 | ENSG00000211893 | immunoglobulin heavy constant gamma 2 (G2m marker) | NA | NA |
| 401027 | C2orf66 | ENSG00000187944 | chromosome 2 open reading frame 66 | NA | NA |
| ENSG00000253364 | RP11-731F5.2 | ENSG00000253364 | NA | NA | NA |
| 973 | CD79A | ENSG00000105369 | CD79a molecule | The B lymphocyte antigen receptor is a multimeric complex that includes the antigen-specific component, surface immunoglobulin (Ig). Surface Ig non-covalently associates with two other proteins, Ig-alpha and Ig-beta, which are necessary for expression and function of the B-cell antigen receptor. This gene encodes the Ig-alpha protein of the B-cell antigen component. Alternatively spliced transcript variants encoding different isoforms have been described. | NA |
| 11065 | UBE2C | ENSG00000175063 | ubiquitin conjugating enzyme E2 C | The modification of proteins with ubiquitin is an important cellular mechanism for targeting abnormal or short-lived proteins for degradation. Ubiquitination involves at least three classes of enzymes: ubiquitin-activating enzymes, ubiquitin-conjugating enzymes, and ubiquitin-protein ligases. This gene encodes a member of the E2 ubiquitin-conjugating enzyme family. The encoded protein is required for the destruction of mitotic cyclins and for cell cycle progression, and may be involved in cancer progression. Multiple transcript variants encoding different isoforms have been found for this gene. Pseudogenes of this gene have been defined on chromosomes 4, 14, 15, 18, and 19. | NA |
| 3801 | KIFC3 | ENSG00000140859 | kinesin family member C3 | This gene encodes a member of the kinesin-14 family of microtubule motors. Members of this family play a role in the formation, maintenance and remodeling of the bipolar mitotic spindle. The protein encoded by this gene has cytoplasmic functions in the interphase cells. It may also be involved in the final stages of cytokinesis. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| 5909 | RAP1GAP | ENSG00000076864 | RAP1 GTPase activating protein | This gene encodes a type of GTPase-activating-protein (GAP) that down-regulates the activity of the ras-related RAP1 protein. RAP1 acts as a molecular switch by cycling between an inactive GDP-bound form and an active GTP-bound form. The product of this gene, RAP1GAP, promotes the hydrolysis of bound GTP and hence returns RAP1 to the inactive state whereas other proteins, guanine nucleotide exchange factors (GEFs), act as RAP1 activators by facilitating the conversion of RAP1 from the GDP- to the GTP-bound form. In general, ras subfamily proteins, such as RAP1, play key roles in receptor-linked signaling pathways that control cell growth and differentiation. RAP1 plays a role in diverse processes such as cell proliferation, adhesion, differentiation, and embryogenesis. Alternative splicing results in multiple transcript variants encoding distinct proteins. | NA |
| 23397 | NCAPH | ENSG00000121152 | non-SMC condensin I complex subunit H | This gene encodes a member of the barr gene family and a regulatory subunit of the condensin complex. This complex is required for the conversion of interphase chromatin into condensed chromosomes. The protein encoded by this gene is associated with mitotic chromosomes, except during the early phase of chromosome condensation. During interphase, the protein has a distinct punctate nucleolar localization. Alternatively spliced transcript variants encoding different proteins have been described. | NA |
| ENSG00000211897 | IGHG3 | ENSG00000211897 | immunoglobulin heavy constant gamma 3 (G3m marker) | NA | NA |
| ENSG00000211896 | IGHG1 | ENSG00000211896 | immunoglobulin heavy constant gamma 1 (G1m marker) | NA | NA |
| 4085 | MAD2L1 | ENSG00000164109 | MAD2 mitotic arrest deficient-like 1 (yeast) | MAD2L1 is a component of the mitotic spindle assembly checkpoint that prevents the onset of anaphase until all chromosomes are properly aligned at the metaphase plate. MAD2L1 is related to the MAD2L2 gene located on chromosome 1. A MAD2 pseudogene has been mapped to chromosome 14. | NA |
| 51203 | NUSAP1 | ENSG00000137804 | nucleolar and spindle associated protein 1 | NUSAP1 is a nucleolar-spindle-associated protein that plays a role in spindle microtubule organization (Raemaekers et al., 2003 [PubMed 12963707]). | NA |
| 122618 | PLD4 | ENSG00000166428 | phospholipase D family member 4 | NA | NA |
| 933 | CD22 | ENSG00000012124 | CD22 molecule | NA | NA |
| NA | NA | ENSG00000256390 | NA | NA | TRUE |
| 9547 | CXCL14 | ENSG00000145824 | C-X-C motif chemokine ligand 14 | This antimicrobial gene belongs to the cytokine gene family which encode secreted proteins involved in immunoregulatory and inflammatory processes. The protein encoded by this gene is structurally related to the CXC (Cys-X-Cys) subfamily of cytokines. Members of this subfamily are characterized by two cysteines separated by a single amino acid. This cytokine displays chemotactic activity for monocytes but not for lymphocytes, dendritic cells, neutrophils or macrophages. It has been implicated that this cytokine is involved in the homeostasis of monocyte-derived macrophages rather than in inflammation. | NA |
| 5443 | POMC | ENSG00000115138 | proopiomelanocortin | This gene encodes a preproprotein that undergoes extensive, tissue-specific, post-translational processing via cleavage by subtilisin-like enzymes known as prohormone convertases. There are eight potential cleavage sites within the preproprotein and, depending on tissue type and the available convertases, processing may yield as many as ten biologically active peptides involved in diverse cellular functions. The encoded protein is synthesized mainly in corticotroph cells of the anterior pituitary where four cleavage sites are used; adrenocorticotrophin, essential for normal steroidogenesis and the maintenance of normal adrenal weight, and lipotropin beta are the major end products. In other tissues, including the hypothalamus, placenta, and epithelium, all cleavage sites may be used, giving rise to peptides with roles in pain and energy homeostasis, melanocyte stimulation, and immune modulation. These include several distinct melanotropins, lipotropins, and endorphins that are contained within the adrenocorticotrophin and beta-lipotropin peptides. The antimicrobial melanotropin alpha peptide exhibits antibacterial and antifungal activity. Mutations in this gene have been associated with early onset obesity, adrenal insufficiency, and red hair pigmentation. Alternatively spliced transcript variants encoding the same protein have been described. | NA |
| 55619 | DOCK10 | ENSG00000135905 | dedicator of cytokinesis 10 | This gene encodes a member of the dedicator of cytokinesis protein family. Members of this family are guanosine nucleotide exchange factors for Rho GTPases and defined by the presence of conserved DOCK-homology regions. The encoded protein belongs to the D (or Zizimin) subfamily of DOCK proteins, which also contain an N-terminal pleckstrin homology domain. Alternatively spliced transcript variants that encode different isoforms have been described. | NA |
| 22974 | TPX2 | ENSG00000088325 | TPX2, microtubule nucleation factor | NA | NA |
| ENSG00000223353 | RP11-290P14.2 | ENSG00000223353 | NA | NA | NA |
| 83461 | CDCA3 | ENSG00000111665 | cell division cycle associated 3 | NA | NA |
| 118430 | MUCL1 | ENSG00000172551 | mucin like 1 | NA | NA |
| 1145 | CHRNE | ENSG00000108556 | cholinergic receptor nicotinic epsilon subunit | Acetylcholine receptors at mature mammalian neuromuscular junctions are pentameric protein complexes composed of four subunits in the ratio of two alpha subunits to one beta, one epsilon, and one delta subunit. The acetylcholine receptor changes subunit composition shortly after birth when the epsilon subunit replaces the gamma subunit seen in embryonic receptors. Mutations in the epsilon subunit are associated with congenital myasthenic syndrome. | NA |
| 5888 | RAD51 | ENSG00000051180 | RAD51 recombinase | The protein encoded by this gene is a member of the RAD51 protein family. RAD51 family members are highly similar to bacterial RecA and Saccharomyces cerevisiae Rad51, and are known to be involved in the homologous recombination and repair of DNA. This protein can interact with the ssDNA-binding protein RPA and RAD52, and it is thought to play roles in homologous pairing and strand transfer of DNA. This protein is also found to interact with BRCA1 and BRCA2, which may be important for the cellular response to DNA damage. BRCA2 is shown to regulate both the intracellular localization and DNA-binding ability of this protein. Loss of these controls following BRCA2 inactivation may be a key event leading to genomic instability and tumorigenesis. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| 221424 | LRRC73 | ENSG00000204052 | leucine rich repeat containing 73 | NA | NA |
| 51676 | ASB2 | ENSG00000100628 | ankyrin repeat and SOCS box containing 2 | This gene encodes a member of the ankyrin repeat and SOCS box-containing (ASB) protein family. These proteins play a role in protein degradation by coupling suppressor of cytokine signalling (SOCS) proteins with the elongin BC complex. The encoded protein is a subunit of a multimeric E3 ubiquitin ligase complex that mediates the degradation of actin-binding proteins. This gene plays a role in retinoic acid-induced growth inhibition and differentiation of myeloid leukemia cells. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| 144406 | WDR66 | ENSG00000158023 | WD repeat domain 66 | The protein encoded by this gene belongs to the WD repeat-containing family of proteins, which function in the formation of protein-protein complexes in a variety of biological pathways. This family member appears to function in the determination of mean platelet volume (MPV), and polymorphisms in this gene have been associated with variance in MPV. Alternative splicing of this gene results in multiple transcript variants. | NA |
| 113130 | CDCA5 | ENSG00000146670 | cell division cycle associated 5 | NA | NA |
| 79696 | ZC2HC1C | ENSG00000119703 | zinc finger C2HC-type containing 1C | NA | NA |
| 51659 | GINS2 | ENSG00000131153 | GINS complex subunit 2 | The yeast heterotetrameric GINS complex is made up of Sld5 (GINS4; MIM 610611), Psf1 (GINS1; MIM 610608), Psf2, and Psf3 (GINS3; MIM 610610). The formation of this complex is essential for the initiation of DNA replication in yeast and Xenopus egg extracts (Ueno et al., 2005 [PubMed 16287864]). See GINS1 for additional information about the GINS complex. | NA |
| ENSG00000225062 | CATIP-AS1 | ENSG00000225062 | CATIP antisense RNA 1 | NA | NA |
| ENSG00000204677 | FAM153C | ENSG00000204677 | family with sequence similarity 153 member C | NA | NA |
| 7038 | TG | ENSG00000042832 | thyroglobulin | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | NA |
| 5800 | PTPRO | ENSG00000151490 | protein tyrosine phosphatase, receptor type O | This gene encodes a member of the R3 subtype family of receptor-type protein tyrosine phosphatases. These proteins are localized to the apical surface of polarized cells and may have tissue-specific functions through activation of Src family kinases. This gene contains two distinct promoters, and alternatively spliced transcript variants encoding multiple isoforms have been observed. The encoded proteins may have multiple isoform-specific and tissue-specific functions, including the regulation of osteoclast production and activity, inhibition of cell proliferation and facilitation of apoptosis. This gene is a candidate tumor suppressor, and decreased expression of this gene has been observed in several types of cancer. | NA |
| ENSG00000211890 | IGHA2 | ENSG00000211890 | immunoglobulin heavy constant alpha 2 (A2m marker) | NA | NA |
| 1081 | CGA | ENSG00000135346 | glycoprotein hormones, alpha polypeptide | The four human glycoprotein hormones chorionic gonadotropin (CG), luteinizing hormone (LH), follicle stimulating hormone (FSH), and thyroid stimulating hormone (TSH) are dimers consisting of alpha and beta subunits that are associated noncovalently. The alpha subunits of these hormones are identical, however, their beta chains are unique and confer biological specificity. The protein encoded by this gene is the alpha subunit and belongs to the glycoprotein hormones alpha chain family. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| 6865 | TACR2 | ENSG00000075073 | tachykinin receptor 2 | This gene belongs to a family of genes that function as receptors for tachykinins. Receptor affinities are specified by variations in the 5’-end of the sequence. The receptors belonging to this family are characterized by interactions with G proteins and 7 hydrophobic transmembrane regions. This gene encodes the receptor for the tachykinin neuropeptide substance K, also referred to as neurokinin A. | NA |
| 11339 | OIP5 | ENSG00000104147 | Opa interacting protein 5 | The protein encoded by this gene localizes to centromeres, where it is essential for recruitment of CENP-A through the mediator Holliday junction recognition protein. Expression of this gene is upregulated in several cancers, making it a putative therapeutic target. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| 5891 | MOK | ENSG00000080823 | MOK protein kinase | This gene belongs to the MAP kinase superfamily. The gene was found to be regulated by caudal type transcription factor 2 (Cdx2) protein. The encoded protein, which is localized to epithelial cells in the intestinal crypt, may play a role in growth arrest and differentiation of cells of upper crypt and lower villus regions. Multiple alternatively spliced transcript variants encoding different isoforms have been observed for this gene. | NA |
| 4157 | MC1R | ENSG00000258839 | melanocortin 1 receptor | This intronless gene encodes the receptor protein for melanocyte-stimulating hormone (MSH). The encoded protein, a seven pass transmembrane G protein coupled receptor, controls melanogenesis. Two types of melanin exist: red pheomelanin and black eumelanin. Gene mutations that lead to a loss in function are associated with increased pheomelanin production, which leads to lighter skin and hair color. Eumelanin is photoprotective but pheomelanin may contribute to UV-induced skin damage by generating free radicals upon UV radiation. Binding of MSH to its receptor activates the receptor and stimulates eumelanin synthesis. This receptor is a major determining factor in sun sensitivity and is a genetic risk factor for melanoma and non-melanoma skin cancer. Over 30 variant alleles have been identified which correlate with skin and hair color, providing evidence that this gene is an important component in determining normal human pigment variation. | NA |
| 9455 | HOMER2 | ENSG00000103942 | homer scaffolding protein 2 | This gene encodes a member of the homer family of dendritic proteins. Members of this family regulate group 1 metabotropic glutamate receptor function. The encoded protein is a postsynaptic density scaffolding protein. Alternative splicing results in multiple transcript variants. Two related pseudogenes have been identified on chromosome 14. | NA |
| 283284 | IGSF22 | ENSG00000179057 | immunoglobulin superfamily member 22 | NA | NA |
| 83450 | DRC3 | ENSG00000171962 | dynein regulatory complex subunit 3 | NA | NA |
| 64105 | CENPK | ENSG00000123219 | centromere protein K | CENPK is a subunit of a CENPH (MIM 605607)-CENPI (MIM 300065)-associated centromeric complex that targets CENPA (MIM 117139) to centromeres and is required for proper kinetochore function and mitotic progression (Okada et al., 2006 [PubMed 16622420]). | NA |
| 84057 | MND1 | ENSG00000121211 | meiotic nuclear divisions 1 | The product of the MND1 gene associates with HOP2 (MIM 608665) to form a stable heterodimeric complex that binds DNA and stimulates the recombinase activity of RAD51 (MIM 179617) and DMC1 (MIM 602721) (Chi et al., 2007 [PubMed 17639080]). Both the MND1 and HOP2 genes are indispensable for meiotic recombination. | NA |
| 91147 | TMEM67 | ENSG00000164953 | transmembrane protein 67 | The protein encoded by this gene localizes to the primary cilium and to the plasma membrane. The gene functions in centriole migration to the apical membrane and formation of the primary cilium. Multiple transcript variants encoding different isoforms have been found for this gene. Defects in this gene are a cause of Meckel syndrome type 3 (MKS3) and Joubert syndrome type 6 (JBTS6). | NA |
| 2649 | NR6A1 | ENSG00000148200 | nuclear receptor subfamily 6 group A member 1 | This gene encodes an orphan nuclear receptor which is a member of the nuclear hormone receptor family. Its expression pattern suggests that it may be involved in neurogenesis and germ cell development. The protein can homodimerize and bind DNA, but in vivo targets have not been identified. Alternate splicing results in multiple transcript variants. | NA |
| NA | NA | ENSG00000034063 | NA | NA | TRUE |
| 23762 | OSBP2 | ENSG00000184792 | oxysterol binding protein 2 | The protein encoded by this gene contains a pleckstrin homology (PH) domain and an oxysterol-binding region. It binds oxysterols such as 7-ketocholesterol and may inhibit their cytotoxicity. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| ENSG00000250654 | RP11-834C11.7 | ENSG00000250654 | NA | NA | NA |
| NA | NA | ENSG00000234603 | NA | NA | TRUE |
| 55143 | CDCA8 | ENSG00000134690 | cell division cycle associated 8 | This gene encodes a component of the chromosomal passenger complex. This complex is an essential regulator of mitosis and cell division. This protein is cell-cycle regulated and is required for chromatin-induced microtubule stabilization and spindle formation. Alternate splicing results in multiple transcript variants. Pseudogenes of this gene are found on chromosomes 7, 8 and 16. | NA |
| ENSG00000166770 | ZNF667-AS1 | ENSG00000166770 | ZNF667 antisense RNA 1 (head to head) | NA | NA |
| 1047 | CLGN | ENSG00000153132 | calmegin | Calmegin is a testis-specific endoplasmic reticulum chaperone protein. CLGN may play a role in spermatogenesis and infertility. | NA |
| 118491 | CFAP70 | ENSG00000156042 | cilia and flagella associated protein 70 | NA | NA |
| NA | NA | ENSG00000260655 | NA | NA | TRUE |
| 63934 | ZNF667 | ENSG00000198046 | zinc finger protein 667 | NA | NA |
| 7083 | TK1 | ENSG00000167900 | thymidine kinase 1 | NA | NA |
| 229 | ALDOB | ENSG00000136872 | aldolase, fructose-bisphosphate B | Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | NA |
| 283417 | DPY19L2 | ENSG00000177990 | dpy-19 like 2 | The protein encoded by this gene belongs to the dpy-19 family. It is highly expressed in testis, and is required for sperm head elongation and acrosome formation during spermatogenesis. Mutations in this gene are associated with an infertility disorder, spermatogenic failure type 9 (SPGF9). | NA |
| 3976 | LIF | ENSG00000128342 | leukemia inhibitory factor | The protein encoded by this gene is a pleiotropic cytokine with roles in several different systems. It is involved in the induction of hematopoietic differentiation in normal and myeloid leukemia cells, induction of neuronal cell differentiation, regulator of mesenchymal to epithelial conversion during kidney development, and may also have a role in immune tolerance at the maternal-fetal interface. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| 5880 | RAC2 | ENSG00000128340 | ras-related C3 botulinum toxin substrate 2 (rho family, small GTP binding protein Rac2) | This gene encodes a member of the Ras superfamily of small guanosine triphosphate (GTP)-metabolizing proteins. The encoded protein localizes to the plasma membrane, where it regulates diverse processes, such as secretion, phagocytosis, and cell polarization. Activity of this protein is also involved in the generation of reactive oxygen species. Mutations in this gene are associated with neutrophil immunodeficiency syndrome. There is a pseudogene for this gene on chromosome 6. | NA |
| 55010 | PARPBP | ENSG00000185480 | PARP1 binding protein | NA | NA |
| 346653 | FAM71F2 | ENSG00000205085 | family with sequence similarity 71 member F2 | NA | NA |
| 5819 | NECTIN2 | ENSG00000130202 | nectin cell adhesion molecule 2 | This gene encodes a single-pass type I membrane glycoprotein with two Ig-like C2-type domains and an Ig-like V-type domain. This protein is one of the plasma membrane components of adherens junctions. It also serves as an entry for certain mutant strains of herpes simplex virus and pseudorabies virus, and it is involved in cell to cell spreading of these viruses. Variations in this gene have been associated with differences in the severity of multiple sclerosis. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | NA |
| 10635 | RAD51AP1 | ENSG00000111247 | RAD51 associated protein 1 | NA | NA |
| 5347 | PLK1 | ENSG00000166851 | polo like kinase 1 | The Ser/Thr protein kinase encoded by this gene belongs to the CDC5/Polo subfamily. It is highly expressed during mitosis and elevated levels are found in many different types of cancer. Depletion of this protein in cancer cells dramatically inhibited cell proliferation and induced apoptosis; hence, it is a target for cancer therapy. | NA |
| 816 | CAMK2B | ENSG00000058404 | calcium/calmodulin dependent protein kinase II beta | The product of this gene belongs to the serine/threonine protein kinase family and to the Ca(2+)/calmodulin-dependent protein kinase subfamily. Calcium signaling is crucial for several aspects of plasticity at glutamatergic synapses. In mammalian cells, the enzyme is composed of four different chains: alpha, beta, gamma, and delta. The product of this gene is a beta chain. It is possible that distinct isoforms of this chain have different cellular localizations and interact differently with calmodulin. Alternative splicing results in multiple transcript variants. | NA |
| 1775 | DNASE1L2 | ENSG00000167968 | deoxyribonuclease I-like 2 | NA | NA |
| 962 | CD48 | ENSG00000117091 | CD48 molecule | This gene encodes a member of the CD2 subfamily of immunoglobulin-like receptors which includes SLAM (signaling lymphocyte activation molecules) proteins. The encoded protein is found on the surface of lymphocytes and other immune cells, dendritic cells and endothelial cells, and participates in activation and differentiation pathways in these cells. The encoded protein does not have a transmembrane domain but is held at the cell surface by a GPI anchor via a C-terminal domain which may be cleaved to yield a soluble form of the receptor. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| 56992 | KIF15 | ENSG00000163808 | kinesin family member 15 | NA | NA |
| 890 | CCNA2 | ENSG00000145386 | cyclin A2 | The protein encoded by this gene belongs to the highly conserved cyclin family, whose members are characterized by a dramatic periodicity in protein abundance through the cell cycle. Cyclins function as regulators of CDK kinases. Different cyclins exhibit distinct expression and degradation patterns which contribute to the temporal coordination of each mitotic event. In contrast to cyclin A1, which is present only in germ cells, this cyclin is expressed in all tissues tested. This cyclin binds and activates CDC2 or CDK2 kinases, and thus promotes both cell cycle G1/S and G2/M transitions. | NA |
| 249 | ALPL | ENSG00000162551 | alkaline phosphatase, liver/bone/kidney | This gene encodes a member of the alkaline phosphatase family of proteins. There are at least four distinct but related alkaline phosphatases: intestinal, placental, placental-like, and liver/bone/kidney (tissue non-specific). The first three are located together on chromosome 2, while the tissue non-specific form is located on chromosome 1. The product of this gene is a membrane bound glycosylated enzyme that is not expressed in any particular tissue and is, therefore, referred to as the tissue-nonspecific form of the enzyme. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature enzyme. This enzyme may play a role in bone mineralization. Mutations in this gene have been linked to hypophosphatasia, a disorder that is characterized by hypercalcemia and skeletal defects. | NA |
| 4879 | NPPB | ENSG00000120937 | natriuretic peptide B | This gene is a member of the natriuretic peptide family and encodes a secreted protein which functions as a cardiac hormone. The protein undergoes two cleavage events, one within the cell and a second after secretion into the blood. The protein’s biological actions include natriuresis, diuresis, vasorelaxation, inhibition of renin and aldosterone secretion, and a key role in cardiovascular homeostasis. A high concentration of this protein in the bloodstream is indicative of heart failure. The protein also acts as an antimicrobial peptide with antibacterial and antifungal activity. Mutations in this gene have been associated with postmenopausal osteoporosis. | NA |
| 1917 | EEF1A2 | ENSG00000101210 | eukaryotic translation elongation factor 1 alpha 2 | This gene encodes an isoform of the alpha subunit of the elongation factor-1 complex, which is responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome. This isoform (alpha 2) is expressed in brain, heart and skeletal muscle, and the other isoform (alpha 1) is expressed in brain, placenta, lung, liver, kidney, and pancreas. This gene may be critical in the development of ovarian cancer. | NA |
| 4050 | LTB | ENSG00000227507 | lymphotoxin beta | Lymphotoxin beta is a type II membrane protein of the TNF family. It anchors lymphotoxin-alpha to the cell surface through heterotrimer formation. The predominant form on the lymphocyte surface is the lymphotoxin-alpha 1/beta 2 complex (e.g. 1 molecule alpha/2 molecules beta) and this complex is the primary ligand for the lymphotoxin-beta receptor. The minor complex is lymphotoxin-alpha 2/beta 1. LTB is an inducer of the inflammatory response system and involved in normal development of lymphoid tissue. Lymphotoxin-beta isoform b is unable to complex with lymphotoxin-alpha suggesting a function for lymphotoxin-beta which is independent of lymphotoxin-alpha. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| 23250 | ATP11A | ENSG00000068650 | ATPase phospholipid transporting 11A | The protein encoded by this gene is an integral membrane ATPase. The encoded protein is probably phosphorylated in its intermediate state and likely drives the transport of ions such as calcium across membranes. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| 1728 | NQO1 | ENSG00000181019 | NAD(P)H quinone dehydrogenase 1 | This gene is a member of the NAD(P)H dehydrogenase (quinone) family and encodes a cytoplasmic 2-electron reductase. This FAD-binding protein forms homodimers and reduces quinones to hydroquinones. This protein’s enzymatic activity prevents the one electron reduction of quinones that results in the production of radical species. Mutations in this gene have been associated with tardive dyskinesia (TD), an increased risk of hematotoxicity after exposure to benzene, and susceptibility to various forms of cancer. Altered expression of this protein has been seen in many tumors and is also associated with Alzheimer’s disease (AD). Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | NA |
| 4635 | MYL4 | ENSG00000198336 | myosin light chain 4 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two myosin heavy chains, two nonphosphorylatable myosin alkali light chains, and two phosphorylatable myosin regulatory light chains. This gene encodes a myosin alkali light chain that is found in embryonic muscle and adult atria. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | NA |
| 10404 | CPQ | ENSG00000104324 | carboxypeptidase Q | This gene encodes a metallopeptidase that belongs to the peptidase M28 family. The encoded protein may catalyze the cleavage of dipeptides with unsubstituted terminals into amino acids. | NA |
| 388588 | SMIM1 | ENSG00000235169 | small integral membrane protein 1 (Vel blood group) | This gene encodes a small, conserved protein that participates in red blood cell formation. The encoded protein is localized to the cell membrane and is the antigen for the Vel blood group. Alternative splicing results in different transcript variants that encode the same protein. | NA |
| 100500808 | MIR3917 | ENSG00000264021 | microRNA 3917 | microRNAs (miRNAs) are short (20-24 nt) non-coding RNAs that are involved in post-transcriptional regulation of gene expression in multicellular organisms by affecting both the stability and translation of mRNAs. miRNAs are transcribed by RNA polymerase II as part of capped and polyadenylated primary transcripts (pri-miRNAs) that can be either protein-coding or non-coding. The primary transcript is cleaved by the Drosha ribonuclease III enzyme to produce an approximately 70-nt stem-loop precursor miRNA (pre-miRNA), which is further cleaved by the cytoplasmic Dicer ribonuclease to generate the mature miRNA and antisense miRNA star (miRNA*) products. The mature miRNA is incorporated into a RNA-induced silencing complex (RISC), which recognizes target mRNAs through imperfect base pairing with the miRNA and most commonly results in translational inhibition or destabilization of the target mRNA. The RefSeq represents the predicted microRNA stem-loop. | NA |
| 115123 | MARCH3 | ENSG00000173926 | membrane associated ring-CH-type finger 3 | This gene encodes a member of the membrane-associated RING-CH (MARCH) family. The encoded protein is an E3 ubiquitin-protein ligase that may be involved in regulation of the endosomal transport pathway. | NA |
| ENSG00000188985 | DHFRP1 | ENSG00000188985 | dihydrofolate reductase pseudogene 1 | NA | NA |
| NA | NA | ENSG00000237485 | NA | NA | TRUE |
| 80235 | PIGZ | ENSG00000119227 | phosphatidylinositol glycan anchor biosynthesis class Z | The glycosylphosphatidylinositol (GPI) anchor is a glycolipid found on many blood cells that serves to anchor proteins to the cell surface. This gene encodes a protein that is localized to the endoplasmic reticulum, and is involved in GPI anchor biosynthesis. As shown for the yeast homolog, which is a member of a family of dolichol-phosphate-mannose (Dol-P-Man)-dependent mannosyltransferases, this protein can also add a side-branching fourth mannose to GPI precursors during the assembly of GPI anchors. | NA |
| 79019 | CENPM | ENSG00000100162 | centromere protein M | The protein encoded by this gene is an inner protein of the kinetochore, the multi-protein complex that binds spindle microtubules to regulate chromosome segregation during cell division. It belongs to the constitutive centromere-associated network protein group, whose members interact with outer kinetochore proteins and help to maintain centromere identity at each cell division cycle. The protein is structurally related to GTPases but cannot bind guanosine triphosphate. A point mutation that affects interaction with another constitutive centromere-associated network protein, CENP-I, impairs kinetochore assembly and chromosome alignment, suggesting that it is required for kinetochore formation. Alternative splicing results in multiple transcript variants. | NA |
| ENSG00000256663 | RP11-424C20.2 | ENSG00000256663 | NA | NA | NA |
| 642280 | ZNF876P | ENSG00000198155 | zinc finger protein 876, pseudogene | NA | NA |
| 7137 | TNNI3 | ENSG00000129991 | troponin I3, cardiac type | Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. This gene encodes the TnI-cardiac protein and is exclusively expressed in cardiac muscle tissues. Mutations in this gene cause familial hypertrophic cardiomyopathy type 7 (CMH7) and familial restrictive cardiomyopathy (RCM). | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",4,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
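
Some rows in the table above have `notfound` set to `TRUE` (for example ENSG00000260655), meaning the query returned no annotation for that Ensembl ID, yet `out$query` still writes those IDs to the cluster file. The following is a minimal sketch, not part of the original analysis, of how matched and unmatched IDs could be separated before writing. `split_found` is a hypothetical helper, and `ann` is a toy stand-in for `as.data.frame(out)` built from three rows of the table above. The sketch assumes the `notfound` column may be absent when every query matches, as in the first table of this document.

```r
# Toy stand-in for as.data.frame(out): three rows copied from the table above.
ann <- data.frame(
  query    = c("ENSG00000153132", "ENSG00000260655", "ENSG00000167900"),
  symbol   = c("CLGN", NA, "TK1"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of mygene): split an annotation table into
# matched rows and the vector of query IDs that returned nothing.
split_found <- function(df) {
  # Assumption: if there is no `notfound` column, treat every query as matched.
  missing <- if ("notfound" %in% names(df)) {
    df$notfound %in% TRUE          # NA counts as "found"
  } else {
    rep(FALSE, nrow(df))
  }
  list(found = df[!missing, , drop = FALSE], missing = df$query[missing])
}

res <- split_found(ann)
res$missing      # IDs with no annotation
res$found$query  # IDs that could be written out instead of out$query
```

Whether unmatched IDs should be kept in the written gene list depends on the downstream use; the sketch only makes the choice explicit.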
out <- mygene::queryMany(gene_list[5,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | X_id | name | summary | notfound |
|---|---|---|---|---|---|
| CRNDE | ENSG00000245694 | ENSG00000245694 | colorectal neoplasia differentially expressed (non-protein coding) | NA | NA |
| CMTM2 | ENSG00000140932 | 146225 | CKLF like MARVEL transmembrane domain containing 2 | This gene belongs to the chemokine-like factor gene superfamily, a novel family that links the chemokine and the transmembrane 4 superfamilies of signaling molecules. The protein encoded by this gene may play an important role in testicular development. | NA |
| QPRT | ENSG00000103485 | 23475 | quinolinate phosphoribosyltransferase | This gene encodes a key enzyme in catabolism of quinolinate, an intermediate in the tryptophan-nicotinamide adenine dinucleotide pathway. Quinolinate acts as a most potent endogenous excitotoxin to neurons. Elevation of quinolinate levels in the brain has been linked to the pathogenesis of neurodegenerative disorders such as epilepsy, Alzheimer’s disease, and Huntington’s disease. Alternative splicing results in multiple transcript variants. | NA |
| MANSC1 | ENSG00000111261 | 54682 | MANSC domain containing 1 | NA | NA |
| OLAH | ENSG00000152463 | 55301 | oleoyl-ACP hydrolase | NA | NA |
| BCHE | ENSG00000114200 | 590 | butyrylcholinesterase | Mutant alleles at the BCHE locus are responsible for suxamethonium sensitivity. Homozygous persons sustain prolonged apnea after administration of the muscle relaxant suxamethonium in connection with surgical anesthesia. The activity of pseudocholinesterase in the serum is low and its substrate behavior is atypical. In the absence of the relaxant, the homozygote is at no known disadvantage. | NA |
| PSD | ENSG00000059915 | 5662 | pleckstrin and Sec7 domain containing | This gene encodes a Pleckstrin homology and SEC7 domains-containing protein that functions as a guanine nucleotide exchange factor. The encoded protein regulates signal transduction by activating ADP-ribosylation factor 6. Alternative splicing results in multiple transcript variants. | NA |
| GALNT14 | ENSG00000158089 | 79623 | polypeptide N-acetylgalactosaminyltransferase 14 | This gene encodes a Golgi protein which is a member of the polypeptide N-acetylgalactosaminyltransferase (ppGalNAc-Ts) protein family. These enzymes catalyze the transfer of N-acetyl-D-galactosamine (GalNAc) to the hydroxyl groups on serines and threonines in target peptides. The encoded protein has been shown to transfer GalNAc to large proteins like mucins. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| APOC1 | ENSG00000130208 | 341 | apolipoprotein C1 | This gene encodes a member of the apolipoprotein C1 family. This gene is expressed primarily in the liver, and it is activated when monocytes differentiate into macrophages. The encoded protein plays a central role in high density lipoprotein (HDL) and very low density lipoprotein (VLDL) metabolism. This protein has also been shown to inhibit cholesteryl ester transfer protein in plasma. A pseudogene of this gene is located 4 kb downstream in the same orientation, on the same chromosome. This gene is mapped to chromosome 19, where it resides within an apolipoprotein gene cluster. | NA |
| FOXC1 | ENSG00000054598 | 2296 | forkhead box C1 | This gene belongs to the forkhead family of transcription factors which is characterized by a distinct DNA-binding forkhead domain. The specific function of this gene has not yet been determined; however, it has been shown to play a role in the regulation of embryonic and ocular development. Mutations in this gene cause various glaucoma phenotypes including primary congenital glaucoma, autosomal dominant iridogoniodysgenesis anomaly, and Axenfeld-Rieger anomaly. | NA |
| RTKN | ENSG00000114993 | 6242 | rhotekin | This gene encodes a scaffold protein that interacts with GTP-bound Rho proteins. Binding of this protein inhibits the GTPase activity of Rho proteins. This protein may interfere with the conversion of active, GTP-bound Rho to the inactive GDP-bound form by RhoGAP. Rho proteins regulate many important cellular processes, including cytokinesis, transcription, smooth muscle contraction, cell growth and transformation. Dysregulation of the Rho signal transduction pathway has been implicated in many forms of cancer. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| SORT1 | ENSG00000134243 | 6272 | sortilin 1 | This gene encodes a member of the VPS10-related sortilin family of proteins. The encoded preproprotein is proteolytically processed by furin to generate the mature receptor. This receptor plays a role in the trafficking of different proteins to either the cell surface, or subcellular compartments such as lysosomes and endosomes. Expression levels of this gene may influence the risk of myocardial infarction in human patients. Alternative splicing results in multiple transcript variants. | NA |
| CCDC112 | ENSG00000164221 | 153733 | coiled-coil domain containing 112 | NA | NA |
| KCNAB1 | ENSG00000169282 | 7881 | potassium voltage-gated channel subfamily A member regulatory beta subunit 1 | Potassium channels represent the most complex class of voltage-gated ion channels from both functional and structural standpoints. Their diverse functions include regulating neurotransmitter release, heart rate, insulin secretion, neuronal excitability, epithelial electrolyte transport, smooth muscle contraction, and cell volume. Four sequence-related potassium channel genes - shaker, shaw, shab, and shal - have been identified in Drosophila, and each has been shown to have human homolog(s). This gene encodes a member of the potassium channel, voltage-gated, shaker-related subfamily. This member includes distinct isoforms which are encoded by alternatively spliced transcript variants of this gene. Some of these isoforms are beta subunits, which form heteromultimeric complexes with alpha subunits and modulate the activity of the pore-forming alpha subunits. | NA |
| RARRES3 | ENSG00000133321 | 5920 | retinoic acid receptor responder 3 | Retinoids exert biologic effects such as potent growth inhibitory and cell differentiation activities and are used in the treatment of hyperproliferative dermatological diseases. These effects are mediated by specific nuclear receptor proteins that are members of the steroid and thyroid hormone receptor superfamily of transcriptional regulators. RARRES1, RARRES2, and RARRES3 are genes whose expression is upregulated by the synthetic retinoid tazarotene. RARRES3 is thought to act as a tumor suppressor or growth regulator. | NA |
| RP3-342P20.2 | ENSG00000228477 | ENSG00000228477 | NA | NA | NA |
| DRICH1 | ENSG00000189269 | 51233 | aspartate rich 1 | NA | NA |
| CAMK2N1 | ENSG00000162545 | 55450 | calcium/calmodulin dependent protein kinase II inhibitor 1 | NA | NA |
| PLBD1 | ENSG00000121316 | 79887 | phospholipase B domain containing 1 | NA | NA |
| NA | ENSG00000180672 | NA | NA | NA | TRUE |
| AC005339.2 | ENSG00000268565 | ENSG00000268565 | NA | NA | NA |
| LIF | ENSG00000128342 | 3976 | leukemia inhibitory factor | The protein encoded by this gene is a pleiotropic cytokine with roles in several different systems. It is involved in the induction of hematopoietic differentiation in normal and myeloid leukemia cells, induction of neuronal cell differentiation, regulator of mesenchymal to epithelial conversion during kidney development, and may also have a role in immune tolerance at the maternal-fetal interface. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| NRP2 | ENSG00000118257 | 8828 | neuropilin 2 | This gene encodes a member of the neuropilin family of receptor proteins. The encoded transmembrane protein binds to SEMA3C protein {sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C} and SEMA3F protein {sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3F}, and interacts with vascular endothelial growth factor (VEGF). This protein may play a role in cardiovascular development, axon guidance, and tumorigenesis. Multiple transcript variants encoding distinct isoforms have been identified for this gene. | NA |
| ACRBP | ENSG00000111644 | 84519 | acrosin binding protein | The protein encoded by this gene is similar to proacrosin binding protein sp32 precursor found in mouse, guinea pig, and pig. This protein is located in the sperm acrosome and is thought to function as a binding protein to proacrosin for packaging and condensation of the acrosin zymogen in the acrosomal matrix. This protein is a member of the cancer/testis family of antigens and it is found to be immunogenic. In normal tissues, this mRNA is expressed only in testis, whereas it is detected in a range of different tumor types such as bladder, breast, lung, liver, and colon. | NA |
| RP11-1143G9.4 | ENSG00000257764 | ENSG00000257764 | NA | NA | NA |
| LYZ | ENSG00000090382 | 4069 | lysozyme | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | NA |
| HIC1 | ENSG00000177374 | 3090 | hypermethylated in cancer 1 | This gene functions as a growth regulatory and tumor repressor gene. Hypermethylation or deletion of the region of this gene have been associated with tumors and the contiguous-gene syndrome, Miller-Dieker syndrome. Alternative splicing of this gene results in multiple transcript variants. | NA |
| PSAT1 | ENSG00000135069 | 29968 | phosphoserine aminotransferase 1 | This gene encodes a member of the class-V pyridoxal-phosphate-dependent aminotransferase family. The encoded protein is a phosphoserine aminotransferase and decreased expression may be associated with schizophrenia. Mutations in this gene are also associated with phosphoserine aminotransferase deficiency. Alternative splicing results in multiple transcript variants. Pseudogenes of this gene have been defined on chromosomes 1, 3, and 8. | NA |
| PARP14 | ENSG00000173193 | 54625 | poly(ADP-ribose) polymerase family member 14 | Poly(ADP-ribosyl)ation is an immediate DNA damage-dependent posttranslational modification of histones and other nuclear proteins that contributes to the survival of injured proliferating cells. PARP14 belongs to the superfamily of enzymes that perform this modification (Ame et al., 2004 [PubMed 15273990]). | NA |
| MAML3 | ENSG00000196782 | 55534 | mastermind like transcriptional coactivator 3 | NA | NA |
| ELN | ENSG00000049540 | 2006 | elastin | This gene encodes a protein that is one of the two components of elastic fibers. The encoded protein is rich in hydrophobic amino acids such as glycine and proline, which form mobile hydrophobic regions bounded by crosslinks between lysine residues. Deletions and mutations in this gene are associated with supravalvular aortic stenosis (SVAS) and autosomal dominant cutis laxa. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| TUBB2B | ENSG00000137285 | 347733 | tubulin beta 2B class IIb | The protein encoded by this gene is a beta isoform of tubulin, which binds GTP and is a major component of microtubules. This gene is highly similar to TUBB2A and TUBB2C. Defects in this gene are a cause of asymmetric polymicrogyria. | NA |
| SLC17A9 | ENSG00000101194 | 63910 | solute carrier family 17 member 9 | This gene encodes a member of a family of transmembrane proteins that are involved in the transport of small molecules. The encoded protein participates in the vesicular uptake, storage, and secretion of adenosine triphosphate (ATP) and other nucleotides. A mutation in this gene was found in individuals with autosomal dominant disseminated superficial actinic porokeratosis-8. Alternative splicing results in multiple transcript variants. | NA |
| CEND1 | ENSG00000184524 | 51286 | cell cycle exit and neuronal differentiation 1 | The protein encoded by this gene is a neuron-specific protein. The similar protein in pig enhances neuroblastoma cell differentiation in vitro and may be involved in neuronal differentiation in vivo. Multiple pseudogenes have been reported for this gene. | NA |
| RP11-327P2.5 | ENSG00000231856 | ENSG00000231856 | NA | NA | NA |
| TNFAIP8L1 | ENSG00000185361 | 126282 | TNF alpha induced protein 8 like 1 | NA | NA |
| OAF | ENSG00000184232 | 220323 | out at first homolog | NA | NA |
| STMN2 | ENSG00000104435 | 11075 | stathmin 2 | This gene encodes a member of the stathmin family of phosphoproteins. Stathmin proteins function in microtubule dynamics and signal transduction. The encoded protein plays a regulatory role in neuronal growth and is also thought to be involved in osteogenesis. Reductions in the expression of this gene have been associated with Down’s syndrome and Alzheimer’s disease. Alternatively spliced transcript variants have been observed for this gene. A pseudogene of this gene is located on the long arm of chromosome 6. | NA |
| FLRT1 | ENSG00000126500 | 23769 | fibronectin leucine rich transmembrane protein 1 | This gene encodes a member of the fibronectin leucine rich transmembrane protein (FLRT) family. The family members may function in cell adhesion and/or receptor signalling. Their protein structures resemble small leucine-rich proteoglycans found in the extracellular matrix. The encoded protein shares sequence similarity with two other family members, FLRT2 and FLRT3. This gene is expressed in kidney and brain. | NA |
| CD22 | ENSG00000012124 | 933 | CD22 molecule | NA | NA |
| STX3 | ENSG00000166900 | 6809 | syntaxin 3 | The gene is a member of the syntaxin family. The encoded protein is targeted to the apical membrane of epithelial cells where it forms clusters and is important in establishing and maintaining polarity necessary for protein trafficking involving vesicle fusion and exocytosis. Alternative splicing results in multiple transcript variants. | NA |
| H2AFJ | ENSG00000246705 | 55766 | H2A histone family member J | Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. Nucleosomes consist of approximately 146 bp of DNA wrapped around a histone octamer composed of pairs of each of the four core histones (H2A, H2B, H3, and H4). The chromatin fiber is further compacted through the interaction of a linker histone, H1, with the DNA between the nucleosomes to form higher order chromatin structures. This gene is located on chromosome 12 and encodes a replication-independent histone that is a variant H2A histone. The protein is divergent at the C-terminus compared to the consensus H2A histone family member. This gene also encodes an antimicrobial peptide with antibacterial and antifungal activity. | NA |
| DERL3 | ENSG00000099958 | 91319 | derlin 3 | The protein encoded by this gene belongs to the derlin family, and resides in the endoplasmic reticulum (ER). Proteins that are unfolded or misfolded in the ER must be refolded or degraded to maintain the homeostasis of the ER. This protein appears to be involved in the degradation of misfolded glycoproteins in the ER. Several alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | NA |
| TNF | ENSG00000232810 | 7124 | tumor necrosis factor | This gene encodes a multifunctional proinflammatory cytokine that belongs to the tumor necrosis factor (TNF) superfamily. This cytokine is mainly secreted by macrophages. It can bind to, and thus functions through its receptors TNFRSF1A/TNFR1 and TNFRSF1B/TNFBR. This cytokine is involved in the regulation of a wide spectrum of biological processes including cell proliferation, differentiation, apoptosis, lipid metabolism, and coagulation. This cytokine has been implicated in a variety of diseases, including autoimmune diseases, insulin resistance, and cancer. Knockout studies in mice also suggested the neuroprotective function of this cytokine. | NA |
| ICAM1 | ENSG00000090339 | 3383 | intercellular adhesion molecule 1 | This gene encodes a cell surface glycoprotein which is typically expressed on endothelial cells and cells of the immune system. It binds to integrins of type CD11a / CD18, or CD11b / CD18 and is also exploited by Rhinovirus as a receptor. | NA |
| CCNJL | ENSG00000135083 | 79616 | cyclin J like | NA | NA |
| MFSD2A | ENSG00000168389 | 84879 | major facilitator superfamily domain containing 2A | NA | NA |
| RSPH9 | ENSG00000172426 | 221421 | radial spoke head 9 homolog | This gene encodes a protein thought to be a component of the radial spoke head in motile cilia and flagella. Mutations in this gene are associated with primary ciliary dyskinesia 12. Alternative splicing results in multiple transcript variants. | NA |
| RP11-532F6.3 | ENSG00000272463 | ENSG00000272463 | NA | NA | NA |
| OPRL1 | ENSG00000125510 | 4987 | opioid related nociceptin receptor 1 | The protein encoded by this gene is a member of the 7 transmembrane-spanning G protein-coupled receptor family, and functions as a receptor for the endogenous, opioid-related neuropeptide, nociceptin/orphanin FQ. This receptor-ligand system modulates a variety of biological functions and neurobehavior, including stress responses and anxiety behavior, learning and memory, locomotor activity, and inflammatory and immune responses. A promoter region between this gene and the 5’-adjacent RGS19 (regulator of G-protein signaling 19) gene on the opposite strand functions bi-directionally as a core-promoter for both genes, suggesting co-operative transcriptional regulation of these two functionally related genes. Alternatively spliced transcript variants have been described for this gene. A recent study provided evidence for translational readthrough in this gene and expression of an additional C-terminally extended isoform via the use of an alternative in-frame translation termination codon. | NA |
| AC114730.2 | ENSG00000235151 | ENSG00000235151 | NA | NA | NA |
| AMN1 | ENSG00000151743 | 196394 | antagonist of mitotic exit network 1 homolog | NA | NA |
| PDXP | ENSG00000241360 | 57026 | pyridoxal phosphatase | Pyridoxal 5-prime-phosphate (PLP) is the active form of vitamin B6 that acts as a coenzyme in maintaining biochemical homeostasis. The preferred degradation route from PLP to 4-pyridoxic acid involves the dephosphorylation of PLP by PDXP (Jang et al., 2003 [PubMed 14522954]). | NA |
| CPE | ENSG00000109472 | 1363 | carboxypeptidase E | This gene encodes a member of the M14 family of metallocarboxypeptidases. The encoded preproprotein is proteolytically processed to generate the mature peptidase. This peripheral membrane protein cleaves C-terminal amino acid residues and is involved in the biosynthesis of peptide hormones and neurotransmitters, including insulin. This protein may also function independently of its peptidase activity, as a neurotrophic factor that promotes neuronal survival, and as a sorting receptor that binds to regulated secretory pathway proteins, including prohormones. Mutations in this gene are implicated in type 2 diabetes. | NA |
| RP1-90J20.8 | ENSG00000224846 | ENSG00000224846 | NA | NA | NA |
| KB-1572G7.3 | ENSG00000211683 | ENSG00000211683 | NA | NA | NA |
| LY6E | ENSG00000160932 | 4061 | lymphocyte antigen 6 complex, locus E | NA | NA |
| SDS | ENSG00000135094 | 10993 | serine dehydratase | This gene encodes one of three enzymes that are involved in metabolizing serine and glycine. L-serine dehydratase converts L-serine to pyruvate and ammonia and requires pyridoxal phosphate as a cofactor. The encoded protein can also metabolize threonine to NH4+ and 2-ketobutyrate. The encoded protein is found predominantly in the liver. | NA |
| CTD-2240E14.4 | ENSG00000267387 | ENSG00000267387 | NA | NA | NA |
| PPM1H | ENSG00000111110 | 57460 | protein phosphatase, Mg2+/Mn2+ dependent 1H | NA | NA |
| LOC102723927 | ENSG00000237940 | 102723927 | uncharacterized LOC102723927 | NA | NA |
| RP11-10C24.1 | ENSG00000271020 | ENSG00000271020 | NA | NA | NA |
| DNPH1 | ENSG00000112667 | 10591 | 2’-deoxynucleoside 5’-phosphate N-hydrolase 1 | This gene was identified on the basis of its stimulation by c-Myc protein. The latter is a transcription factor that participates in the regulation of cell proliferation, differentiation, and apoptosis. The exact function of this gene is not known but studies in rat suggest a role in cellular proliferation and c-Myc-mediated transformation. Two alternative transcripts encoding different proteins have been described. | NA |
| AC097724.3 | ENSG00000226833 | ENSG00000226833 | NA | NA | NA |
| RP11-21A7A.3 | ENSG00000256341 | ENSG00000256341 | NA | NA | NA |
| GBP1 | ENSG00000117228 | 2633 | guanylate binding protein 1 | Guanylate binding protein expression is induced by interferon. Guanylate binding proteins are characterized by their ability to specifically bind guanine nucleotides (GMP, GDP, and GTP) and are distinguished from the GTP-binding proteins by the presence of 2 binding motifs rather than 3. | NA |
| PIEZO1 | ENSG00000103335 | 9780 | piezo type mechanosensitive ion channel component 1 | The protein encoded by this gene is a mechanically-activated ion channel that links mechanical forces to biological signals. The encoded protein contains 36 transmembrane domains and functions as a homotetramer. Defects in this gene have been associated with dehydrated hereditary stomatocytosis. | NA |
| CYB5R4 | ENSG00000065615 | 51167 | cytochrome b5 reductase 4 | NCB5OR is a flavohemoprotein that contains functional domains found in both cytochrome b5 (CYB5A; MIM 613218) and CYB5 reductase (CYB5R3; MIM 613213) (Zhu et al., 1999 [PubMed 10611283]). | NA |
| SERPINI1 | ENSG00000163536 | 5274 | serpin family I member 1 | This gene encodes a member of the serpin superfamily of serine proteinase inhibitors. The protein is primarily secreted by axons in the brain, and preferentially reacts with and inhibits tissue-type plasminogen activator. It is thought to play a role in the regulation of axonal growth and the development of synaptic plasticity. Mutations in this gene result in familial encephalopathy with neuroserpin inclusion bodies (FENIB), which is a dominantly inherited form of familial encephalopathy and epilepsy characterized by the accumulation of mutant neuroserpin polymers. Multiple alternatively spliced variants, encoding the same protein, have been identified. | NA |
| NA | ENSG00000175898 | NA | NA | NA | TRUE |
| GALNT7 | ENSG00000109586 | 51809 | polypeptide N-acetylgalactosaminyltransferase 7 | This gene encodes GalNAc transferase 7, a member of the GalNAc-transferase family. The enzyme encoded by this gene controls the initiation step of mucin-type O-linked protein glycosylation and transfer of N-acetylgalactosamine to serine and threonine amino acid residues. This enzyme is a type II transmembrane protein and shares common sequence motifs with other family members. Unlike other family members, this enzyme shows exclusive specificity for partially GalNAc-glycosylated acceptor substrates and shows no activity with non-glycosylated peptides. This protein may function as a follow-up enzyme in the initiation step of O-glycosylation. | NA |
| LTBP2 | ENSG00000119681 | 4053 | latent transforming growth factor beta binding protein 2 | The protein encoded by this gene belongs to the family of latent transforming growth factor (TGF)-beta binding proteins (LTBP), which are extracellular matrix proteins with multi-domain structure. This protein is the largest member of the LTBP family possessing unique regions and with most similarity to the fibrillins. It has thus been suggested that it may have multiple functions: as a member of the TGF-beta latent complex, as a structural component of microfibrils, and a role in cell adhesion. | NA |
| MEGF6 | ENSG00000162591 | 1953 | multiple EGF like domains 6 | NA | NA |
| NA | ENSG00000232222 | NA | NA | NA | TRUE |
| RP11-798M19.3 | ENSG00000248774 | ENSG00000248774 | NA | NA | NA |
| RP11-799B12.2 | ENSG00000264924 | ENSG00000264924 | NA | NA | NA |
| PLXDC2 | ENSG00000120594 | 84898 | plexin domain containing 2 | NA | NA |
| ITM2B | ENSG00000136156 | 9445 | integral membrane protein 2B | Amyloid precursor proteins are processed by beta-secretase and gamma-secretase to produce beta-amyloid peptides which form the characteristic plaques of Alzheimer disease. This gene encodes a transmembrane protein which is processed at the C-terminus by furin or furin-like proteases to produce a small secreted peptide which inhibits the deposition of beta-amyloid. Mutations which result in extension of the C-terminal end of the encoded protein, thereby increasing the size of the secreted peptide, are associated with two neurodegenerative diseases, familial British dementia and familial Danish dementia. | NA |
| RP11-169K16.4 | ENSG00000224459 | ENSG00000224459 | NA | NA | NA |
| GCA | ENSG00000115271 | 25801 | grancalcin | This gene product, grancalcin, is a calcium-binding protein abundant in neutrophils and macrophages. It belongs to the penta-EF-hand subfamily of proteins which includes sorcin, calpain, and ALG-2. Grancalcin localization is dependent upon calcium and magnesium. In the absence of divalent cation, grancalcin localizes to the cytosolic fraction; with magnesium alone, it partitions with the granule fraction; and in the presence of magnesium and calcium, it associates with both the granule and membrane fractions, suggesting a role for grancalcin in granule-membrane fusion and degranulation. | NA |
| MFGE8 | ENSG00000140545 | 4240 | milk fat globule-EGF factor 8 protein | This gene encodes a preproprotein that is proteolytically processed to form multiple protein products. The major encoded protein product, lactadherin, is a membrane glycoprotein that promotes phagocytosis of apoptotic cells. This protein has also been implicated in wound healing, autoimmune disease, and cancer. Lactadherin can be further processed to form a smaller cleavage product, medin, which comprises the major protein component of aortic medial amyloid (AMA). Alternative splicing results in multiple transcript variants. | NA |
| EGR2 | ENSG00000122877 | 1959 | early growth response 2 | The protein encoded by this gene is a transcription factor with three tandem C2H2-type zinc fingers. Defects in this gene are associated with Charcot-Marie-Tooth disease type 1D (CMT1D), Charcot-Marie-Tooth disease type 4E (CMT4E), and with Dejerine-Sottas syndrome (DSS). Multiple transcript variants encoding two different isoforms have been found for this gene. | NA |
| RP11-360F5.3 | ENSG00000249685 | ENSG00000249685 | NA | NA | NA |
| NOV | ENSG00000136999 | 4856 | nephroblastoma overexpressed | The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | NA |
| VAV2 | ENSG00000160293 | 7410 | vav guanine nucleotide exchange factor 2 | VAV2 is the second member of the VAV guanine nucleotide exchange factor family of oncogenes. Unlike VAV1, which is expressed exclusively in hematopoietic cells, VAV2 transcripts were found in most tissues. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| RP11-323N12.5 | ENSG00000267601 | ENSG00000267601 | NA | NA | NA |
| NKD2 | ENSG00000145506 | 85409 | naked cuticle homolog 2 | This gene encodes a member of a family of proteins that function as negative regulators of Wnt receptor signaling through interaction with Dishevelled family members. The encoded protein participates in the delivery of transforming growth factor alpha-containing vesicles to the cell membrane. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| DYX1C1-CCPG1 | ENSG00000261771 | 100533483 | DYX1C1-CCPG1 readthrough (NMD candidate) | This locus represents naturally occurring read-through transcription between the neighboring dyslexia susceptibility 1 candidate 1 (DYX1C1) and cell cycle progression 1 (CCPG1) genes on chromosome 15. The read-through transcript is a candidate for nonsense-mediated mRNA decay (NMD), and is thus unlikely to produce a protein product. | NA |
| CTC-338M12.6 | ENSG00000250900 | ENSG00000250900 | NA | NA | NA |
| ANK1 | ENSG00000029534 | 286 | ankyrin 1 | Ankyrins are a family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton and play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Multiple isoforms of ankyrin with different affinities for various target proteins are expressed in a tissue-specific, developmentally regulated manner. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. Ankyrin 1, the prototype of this family, was first discovered in the erythrocytes, but since has also been found in brain and muscles. Mutations in erythrocytic ankyrin 1 have been associated in approximately half of all patients with hereditary spherocytosis. Complex patterns of alternative splicing in the regulatory domain, giving rise to different isoforms of ankyrin 1 have been described. Truncated muscle-specific isoforms of ankyrin 1 resulting from usage of an alternate promoter have also been identified. | NA |
| GALNT3 | ENSG00000115339 | 2591 | polypeptide N-acetylgalactosaminyltransferase 3 | This gene encodes UDP-GalNAc transferase 3, a member of the GalNAc-transferases family. This family transfers an N-acetyl galactosamine to the hydroxyl group of a serine or threonine residue in the first step of O-linked oligosaccharide biosynthesis. Individual GalNAc-transferases have distinct activities and initiation of O-glycosylation is regulated by a repertoire of GalNAc-transferases. The protein encoded by this gene is highly homologous to other family members, however the enzymes have different substrate specificities. | NA |
| TRIM14 | ENSG00000106785 | 9830 | tripartite motif containing 14 | The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region. The protein localizes to cytoplasmic bodies and its function has not been determined. Alternative splicing results in multiple transcript variants. | NA |
| CTB-51J22.1 | ENSG00000232415 | ENSG00000232415 | NA | NA | NA |
| STARD13 | ENSG00000133121 | 90627 | StAR related lipid transfer domain containing 13 | This gene encodes a protein which contains an N-terminal sterile alpha motif (SAM) for protein-protein interactions, followed by an ATP/GTP-binding motif, a GTPase-activating protein (GAP) domain, and a C-terminal STAR-related lipid transfer (START) domain. It may be involved in regulation of cytoskeletal reorganization, cell proliferation, and cell motility, and acts as a tumor suppressor in hepatoma cells. The gene is located in a region of chromosome 13 that is associated with loss of heterozygosity in hepatocellular carcinomas. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | NA |
| RP11-10C24.3 | ENSG00000271643 | ENSG00000271643 | NA | NA | NA |
| SERPINB6 | ENSG00000124570 | 5269 | serpin family B member 6 | The protein encoded by this gene is a member of the serpin (serine proteinase inhibitor) superfamily, and ovalbumin(ov)-serpin subfamily. It was originally discovered as a placental thrombin inhibitor. The mouse homolog was found to be expressed in the hair cells of the inner ear. Mutations in this gene are associated with nonsyndromic progressive hearing loss, suggesting that this serpin plays an important role in the inner ear in the protection against leakage of lysosomal content during stress, and that loss of this protection results in cell death and sensorineural hearing loss. Alternatively spliced transcript variants have been found for this gene. | NA |
| S100A11 | ENSG00000163191 | 6282 | S100 calcium binding protein A11 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in motility, invasion, and tubulin polymerization. Chromosomal rearrangements and altered expression of this gene have been implicated in tumor metastasis. | NA |
| RP5-1142A6.9 | ENSG00000260121 | ENSG00000260121 | NA | NA | NA |
| MMP17 | ENSG00000198598 | 4326 | matrix metallopeptidase 17 | This gene encodes a member of the peptidase M10 family and membrane-type subfamily of matrix metalloproteinases (MMPs). Proteins in this family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. Members of this subfamily contain a transmembrane domain suggesting that these proteins are expressed at the cell surface rather than secreted. The encoded preproprotein is proteolytically processed to generate the mature protease. This protein is unique among the membrane-type matrix metalloproteinases in that it is anchored to the cell membrane via a glycosylphosphatidylinositol (GPI) anchor. Elevated expression of the encoded protein has been observed in osteoarthritis and multiple human cancers. | NA |
| IGF1R | ENSG00000140443 | 3480 | insulin like growth factor 1 receptor | This receptor binds insulin-like growth factor with a high affinity. It has tyrosine kinase activity. The insulin-like growth factor I receptor plays a critical role in transformation events. Cleavage of the precursor generates alpha and beta subunits. It is highly overexpressed in most malignant tissues where it functions as an anti-apoptotic agent by enhancing cell survival. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | NA |
# Write the queried Ensembl gene IDs (one per line) to the cluster 5 file.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",5,".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[6,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | query | name | X_id | symbol | notfound |
|---|---|---|---|---|---|
| The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | ENSG00000171747 | galectin 4 | 3960 | LGALS4 | NA |
| NA | ENSG00000165862 | NA | NA | NA | TRUE |
| The protein encoded by this gene belongs to the ZIP family of zinc transporters that transport zinc into cells from outside, and play a crucial role in controlling intracellular zinc levels. Zinc is an essential cofactor for many enzymes and proteins involved in gene transcription, growth, development and differentiation. Mutations in this gene have been associated with autosomal dominant high myopia (MYP24). Alternatively spliced transcript variants have been found for this gene. | ENSG00000139540 | solute carrier family 39 member 5 | 283375 | SLC39A5 | NA |
| This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | ENSG00000115386 | regenerating family member 1 alpha | 5967 | REG1A | NA |
| This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. The encoded protein may act as a transcriptional activator. The protein can efficiently bind the NGFI-B Response Element (NBRE). Three different versions of extraskeletal myxoid chondrosarcomas (EMCs) are the result of reciprocal translocations between this gene and other genes. The translocation breakpoints are associated with Nuclear Receptor Subfamily 4, Group A, Member 3 (on chromosome 9) and either Ewing Sarcoma Breakpoint Region 1 (on chromosome 22), RNA Polymerase II, TATA Box-Binding Protein-Associated Factor, 68-KD (on chromosome 17), or Transcription factor 12 (on chromosome 15). Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000119508 | nuclear receptor subfamily 4 group A member 3 | 8013 | NR4A3 | NA |
| This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is expressed in the brain and pancreas and is resistant to common trypsin inhibitors. It is active on peptide linkages involving the carboxyl group of lysine or arginine. This gene is localized to the locus of T cell receptor beta variable orphans on chromosome 9. Four transcript variants encoding different isoforms have been described for this gene. | ENSG00000010438 | protease, serine 3 | 5646 | PRSS3 | NA |
| This gene encodes a protein precursor of the digestive enzyme pepsin, a member of the peptidase A1 family of endopeptidases. The encoded precursor is secreted by gastric chief cells and undergoes autocatalytic cleavage in acidic conditions to form the active enzyme, which functions in the digestion of dietary proteins. This gene is found in a cluster of related genes on chromosome 11, each of which encodes one of multiple pepsinogens. Pepsinogen levels in serum may serve as a biomarker for atrophic gastritis and gastric cancer. | ENSG00000229859 | pepsinogen 3, group I (pepsinogen A) | 643834 | PGA3 | NA |
| The protein encoded by this gene belongs to the ‘regulator of G protein signaling’ family. It inhibits signal transduction by increasing the GTPase activity of G protein alpha subunits. It also may play a role in regulating the kinetics of signaling in the phototransduction cascade. | ENSG00000143333 | regulator of G-protein signaling 16 | 6004 | RGS16 | NA |
| This gene encodes a member of the muscle segment homeobox gene family. The encoded protein functions as a transcriptional repressor during embryogenesis through interactions with components of the core transcription complex and other homeoproteins. It may also have roles in limb-pattern formation, craniofacial development, particularly odontogenesis, and tumor growth inhibition. Mutations in this gene, which was once known as homeobox 7, have been associated with nonsyndromic cleft lip with or without cleft palate 5, Witkop syndrome, Wolf-Hirschhorn syndrome, and autosomal dominant hypodontia. | ENSG00000163132 | msh homeobox 1 | 4487 | MSX1 | NA |
| This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | ENSG00000172023 | regenerating family member 1 beta | 5968 | REG1B | NA |
| The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. | ENSG00000164266 | serine peptidase inhibitor, Kazal type 1 | 6690 | SPINK1 | NA |
| This gene encodes a pancreatic secretory protein that may be involved in cell proliferation or differentiation. It has similarity to the C-type lectin superfamily. The enhanced expression of this gene is observed during pancreatic inflammation and liver carcinogenesis. The mature protein also functions as an antimicrobial protein with antibacterial activity. Alternate splicing results in multiple transcript variants that encode the same protein. | ENSG00000172016 | regenerating family member 3 alpha | 5068 | REG3A | NA |
| This gene encodes a member of the syntaxin family. Syntaxins have been implicated in the targeting and fusion of intracellular transport vesicles. This family member may regulate protein transport among late endosomes and the trans-Golgi network. Mutations in this gene have been associated with familial hemophagocytic lymphohistiocytosis. | ENSG00000135604 | syntaxin 11 | 8676 | STX11 | NA |
| This gene encodes a protein containing coiled-coil domains. The encoded protein functions in outer dynein arm assembly and is required for motile cilia function. Mutations in this gene result in primary ciliary dyskinesia. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000198003 | coiled-coil domain containing 151 | 115948 | CCDC151 | NA |
| NA | ENSG00000139572 | G protein-coupled receptor 84 | 53831 | GPR84 | NA |
| NA | ENSG00000157315 | transmembrane p24 trafficking protein 6 | 146456 | TMED6 | NA |
| This gene encodes a type I membrane glycoprotein containing two extracellular immunoglobulin domains, a transmembrane and a cytoplasmic domain. This gene is expressed by various cell types, including B cells, a subset of T cells, thymocytes, endothelial cells, and neurons. The encoded protein plays an important role in immunosuppression and regulation of anti-tumor activity. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000091972 | CD200 molecule | 4345 | CD200 | NA |
| NA | ENSG00000211454 | aldo-keto reductase family 7-like (gene/pseudogene) | ENSG00000211454 | AKR7L | NA |
| NA | ENSG00000108932 | solute carrier family 16 member 6 | 9120 | SLC16A6 | NA |
| Aldo-keto reductases, such as AKR7A3, are involved in the detoxification of aldehydes and ketones. | ENSG00000162482 | aldo-keto reductase family 7 member A3 | 22977 | AKR7A3 | NA |
| NA | ENSG00000131094 | complement component 1, q subcomponent-like 1 | 10882 | C1QL1 | NA |
| This gene encodes a member of the low density lipoprotein receptor (LDLR) family. Low density lipoprotein receptors are cell surface proteins that play roles in both signal transduction and receptor-mediated endocytosis of specific ligands for lysosomal degradation. The encoded protein plays a critical role in the migration of neurons during development by mediating Reelin signaling, and also functions as a receptor for the cholesterol transport protein apolipoprotein E. Expression of this gene may be a marker for major depressive disorder. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000157193 | LDL receptor related protein 8 | 7804 | LRP8 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. The encoded protein metabolizes drugs as well as the steroid hormones testosterone and progesterone. This gene is part of a cluster of cytochrome P450 genes on chromosome 7q21.1. Two pseudogenes of this gene have been identified within this cluster on chromosome 7. Expression of this gene is widely variable among populations, and a single nucleotide polymorphism that affects transcript splicing has been associated with susceptibility to hypertension. Alternative splicing results in multiple transcript variants. | ENSG00000106258 | cytochrome P450 family 3 subfamily A member 5 | 1577 | CYP3A5 | NA |
| The protein encoded by this gene is found to be down-regulated in human gastric cancer tissue as compared to normal gastric mucosa. | ENSG00000169605 | gastrokine 1 | 56287 | GKN1 | NA |
| Metabolic N-oxidation of the diet-derived amino-trimethylamine (TMA) is mediated by flavin-containing monooxygenase and is subject to an inherited FMO3 polymorphism in man resulting in a small subpopulation with reduced TMA N-oxidation capacity resulting in fish odor syndrome Trimethylaminuria. Three forms of the enzyme, FMO1 found in fetal liver, FMO2 found in adult liver, and FMO3 are encoded by genes clustered in the 1q23-q25 region. Flavin-containing monooxygenases are NADPH-dependent flavoenzymes that catalyze the oxidation of soft nucleophilic heteroatom centers in drugs, pesticides, and xenobiotics. Alternative splicing results in multiple transcript variants. | ENSG00000131781 | flavin containing monooxygenase 5 | 2330 | FMO5 | NA |
| NA | ENSG00000244124 | ATP1B3 antisense RNA 1 | ENSG00000244124 | ATP1B3-AS1 | NA |
| NA | ENSG00000230701 | F-box and WD repeat domain containing 4 pseudogene 1 | 26226 | FBXW4P1 | NA |
| This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | ENSG00000169347 | glycoprotein 2 | 2813 | GP2 | NA |
| This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000182333 | lipase F, gastric type | 8513 | LIPF | NA |
| NA | ENSG00000214193 | SH3 domain containing 21 | 79729 | SH3D21 | NA |
| This gene was originally cloned from human myeloblasts and found to be selectively expressed in inflamed colonic epithelium. This gene encodes a member of the olfactomedin family. The encoded protein is an antiapoptotic factor that promotes tumor growth and is an extracellular matrix glycoprotein that facilitates cell adhesion. | ENSG00000102837 | olfactomedin 4 | 10562 | OLFM4 | NA |
| NA | ENSG00000172738 | transmembrane protein 217 | 221468 | TMEM217 | NA |
| NA | ENSG00000099625 | CACN beta subunit associated regulatory protein | 255057 | CBARP | NA |
| This gene encodes a member of the Ras superfamily of small GTPases and is induced by dexamethasone. The encoded protein is an activator of G-protein signaling and acts as a direct nucleotide exchange factor for Gi-Go proteins. This protein interacts with the neuronal nitric oxide adaptor protein CAPON, and a nuclear adaptor protein FE65, which interacts with the Alzheimer’s disease amyloid precursor protein. This gene may play a role in dexamethasone-induced alterations in cell morphology, growth and cell-extracellular matrix interactions. Epigenetic inactivation of this gene is closely correlated with resistance to dexamethasone in multiple myeloma cells. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | ENSG00000108551 | ras related dexamethasone induced 1 | 51655 | RASD1 | NA |
| The protein encoded by this gene belongs to the innexin family. Innexin family members are the structural components of gap junctions. This protein and pannexin 1 are abundantly expressed in central nervous system (CNS) and are coexpressed in various neuronal populations. Studies in Xenopus oocytes suggest that this protein alone and in combination with pannexin 1 may form cell type-specific gap junctions with distinct properties. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000073150 | pannexin 2 | 56666 | PANX2 | NA |
| NA | ENSG00000010282 | hedgehog acyltransferase-like | 57467 | HHATL | NA |
| The protein encoded by this gene belongs to the WD repeat-containing family of proteins, which function in the formation of protein-protein complexes in a variety of biological pathways. This family member appears to function in the determination of mean platelet volume (MPV), and polymorphisms in this gene have been associated with variance in MPV. Alternative splicing of this gene results in multiple transcript variants. | ENSG00000158023 | WD repeat domain 66 | 144406 | WDR66 | NA |
| NA | ENSG00000237188 | NA | ENSG00000237188 | RP11-337C18.8 | NA |
| NA | ENSG00000149564 | endothelial cell adhesion molecule | 90952 | ESAM | NA |
| This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. Expression is induced by phytohemagglutinin in human lymphocytes and by serum stimulation of arrested fibroblasts. The encoded protein acts as a nuclear transcription factor. Translocation of the protein from the nucleus to mitochondria induces apoptosis. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000123358 | nuclear receptor subfamily 4 group A member 1 | 3164 | NR4A1 | NA |
| The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000175592 | FOS like 1, AP-1 transcription factor subunit | 8061 | FOSL1 | NA |
| This gene encodes a member of the selenium-binding protein family. Selenium is an essential nutrient that exhibits potent anticarcinogenic properties, and deficiency of selenium may cause certain neurologic diseases. The effects of selenium in preventing cancer and neurologic diseases may be mediated by selenium-binding proteins, and decreased expression of this gene may be associated with several types of cancer. The encoded protein may play a selenium-dependent role in ubiquitination/deubiquitination-mediated protein degradation. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ENSG00000143416 | selenium binding protein 1 | 8991 | SELENBP1 | NA |
| NA | ENSG00000108187 | phenazine biosynthesis like protein domain containing | 64081 | PBLD | NA |
| This gene encodes a zinc finger protein containing a KRAB (Kruppel-associated box) domain found in transcriptional repressors. This gene may be methylated and silenced in cancer cells. This gene is located within a differentially methylated region (DMR) and shows allele-specific expression in placenta. Alternative splicing and the use of alternative promoters results in multiple transcript variants encoding the same protein. | ENSG00000130844 | zinc finger protein 331 | 55422 | ZNF331 | NA |
| The protein encoded by this gene has substantial phospholipase activity and may be involved in lipoprotein metabolism and vascular biology. This protein is designated a member of the TG lipase family by its sequence and characteristic lid region which provides substrate specificity for enzymes of the TG lipase family. | ENSG00000101670 | lipase G, endothelial type | 9388 | LIPG | NA |
| NA | ENSG00000230280 | heterogeneous nuclear ribonucleoprotein A1 pseudogene 59 | ENSG00000230280 | HNRNPA1P59 | NA |
| This gene encodes a member of the semaphorin family of soluble and transmembrane proteins. Semaphorins are involved in numerous functions, including axon guidance, morphogenesis, carcinogenesis, and immunomodulation. The encoded protein is a single-pass type I membrane protein containing an immunoglobulin-like C2-type domain, a PSI domain and a sema domain. It inhibits axonal extension by providing local signals to specify territories inaccessible for growing axons. It is an activator of T-cell-mediated immunity and suppresses vascular endothelial growth factor (VEGF)-mediated endothelial cell migration and proliferation in vitro and angiogenesis in vivo. Mutations in this gene are associated with retinal degenerative diseases including retinitis pigmentosa type 35 (RP35) and cone-rod dystrophy type 10 (CORD10). Multiple alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000196189 | semaphorin 4A | 64218 | SEMA4A | NA |
| This gene encodes a serine/threonine protein kinase. Although this gene product is similar to serum- and glucocorticoid-induced protein kinase (SGK), this gene is not induced by serum or glucocorticoids. This gene is induced in response to signals that activate phosphatidylinositol 3-kinase, which is also true for SGK. Alternative splicing results in multiple transcript variants. | ENSG00000101049 | SGK2, serine/threonine kinase 2 | 10110 | SGK2 | NA |
| NA | ENSG00000271769 | NA | NA | NA | TRUE |
| NA | ENSG00000250606 | NA | NA | NA | TRUE |
| The protein encoded by this gene is a member of the fibroblast growth factor (FGF) family. FGF family members possess broad mitogenic and cell survival activities, and are involved in a variety of biological processes, including embryonic development, cell growth, morphogenesis, tissue repair, tumor growth and invasion. The function of this gene has not yet been determined. The expression pattern of the mouse homolog implies a role in nervous system development. Alternative splicing results in multiple transcript variants. | ENSG00000161958 | fibroblast growth factor 11 | 2256 | FGF11 | NA |
| The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. | ENSG00000171476 | HOP homeobox | 84525 | HOPX | NA |
| NA | ENSG00000226445 | uncharacterized LOC101929523 | 101929523 | LOC101929523 | NA |
| NA | ENSG00000168490 | phytanoyl-CoA 2-hydroxylase interacting protein | 9796 | PHYHIP | NA |
| Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | ENSG00000158516 | carboxypeptidase A2 | 1358 | CPA2 | NA |
| NA | ENSG00000135245 | hypoxia inducible lipid droplet associated | 29923 | HILPDA | NA |
| This gene functions in the regulation of autophagy, a lysosomal degradation pathway. This gene also functions as an antisense transcript in the posttranscriptional regulation of the endothelial nitric oxide synthase 3 gene, which has 3’ overlap with this gene on the opposite strand. Mutations in this gene and disruption of the autophagy process have been associated with multiple cancers. Alternative splicing results in multiple transcript variants. | ENSG00000181652 | autophagy related 9B | 285973 | ATG9B | NA |
| NA | ENSG00000261504 | uncharacterized LOC284648 | 284648 | LOC284648 | NA |
| The protein encoded by this gene contains a HMG box DNA binding domain. HMG boxes are found in many eukaryotic proteins involved in chromatin assembly, transcription and replication. This protein may function to regulate T-cell development. | ENSG00000198846 | thymocyte selection associated high mobility group box | 9760 | TOX | NA |
| This gene encodes a member of the cysteine-aspartic acid protease (caspase) family of enzymes. Sequential activation of caspases plays a central role in the execution-phase of cell apoptosis. Caspases exist as inactive proenzymes which undergo proteolytic processing at conserved aspartic acid residues to produce two subunits, large and small, that dimerize to form the active enzyme. This protein is processed by caspases 7, 8 and 10, and is thought to function as a downstream enzyme in the caspase activation cascade. Alternative splicing of this gene results in multiple transcript variants that encode different isoforms. | ENSG00000138794 | caspase 6 | 839 | CASP6 | NA |
| This gene encodes a basic helix-loop-helix protein expressed in various tissues. The encoded protein can interact with ARNTL or compete for E-box binding sites in the promoter of PER1 and repress CLOCK/ARNTL’s transactivation of PER1. This gene is believed to be involved in the control of circadian rhythm and cell differentiation. | ENSG00000134107 | basic helix-loop-helix family member e40 | 8553 | BHLHE40 | NA |
| The protein encoded by this gene belongs to the P2X family of G-protein-coupled receptors. These proteins can form homo- and heterotrimers and function as ATP-gated ion channels and mediate rapid and selective permeability to cations. This protein is primarily localized to smooth muscle where it binds ATP and mediates synaptic transmission between neurons and from neurons to smooth muscle and may be responsible for sympathetic vasoconstriction in small arteries, arterioles and vas deferens. Mouse studies suggest that this receptor is essential for normal male reproductive function. This protein may also be involved in promoting apoptosis. | ENSG00000108405 | purinergic receptor P2X 1 | 5023 | P2RX1 | NA |
| NA | ENSG00000129007 | calmodulin like 4 | 91860 | CALML4 | NA |
| The protein encoded by this gene belongs to the class-3 semaphorin/collapsin family, whose members function in growth cone guidance during neuronal development. This family member inhibits axonal extension and has been shown to act as a tumor suppressor by inducing apoptosis. Alternative splicing of this gene results in multiple transcript variants. | ENSG00000012171 | semaphorin 3B | 7869 | SEMA3B | NA |
| The protein encoded by this intronless gene is an endothelial-specific type I membrane receptor that binds thrombin. This binding results in the activation of protein C, which degrades clotting factors Va and VIIIa and reduces the amount of thrombin generated. Mutations in this gene are a cause of thromboembolic disease, also known as inherited thrombophilia. | ENSG00000178726 | thrombomodulin | 7056 | THBD | NA |
| NA | ENSG00000163995 | actin binding LIM protein family member 2 | 84448 | ABLIM2 | NA |
| NA | ENSG00000164620 | RELT like 2 | 285613 | RELL2 | NA |
| Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | ENSG00000042832 | thyroglobulin | 7038 | TG | NA |
| NA | ENSG00000198429 | zinc finger protein 69 | 7620 | ZNF69 | NA |
| The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. Differential and in situ hybridization studies indicate that this lectin is specifically expressed in keratinocytes and found mainly in stratified squamous epithelium. A duplicate copy of this gene (GeneID:3963) is found adjacent to, but on the opposite strand on chromosome 19. | ENSG00000178934 | galectin 7B | 653499 | LGALS7B | NA |
| NA | ENSG00000259539 | NA | ENSG00000259539 | CTD-2651B20.1 | NA |
| NA | ENSG00000234043 | NA | ENSG00000234043 | RP11-56M3.1 | NA |
| This gene encodes a protein that shares sequence similarity to nucleosome assembly factors, but may be localized to the cytoplasm rather than the nucleus. Expression of this gene is downregulated in hepatocellular carcinomas. This gene is located within a differentially methylated region (DMR) and is imprinted and paternally expressed. There is a related pseudogene on chromosome 4. | ENSG00000177432 | nucleosome assembly protein 1 like 5 | 266812 | NAP1L5 | NA |
| This gene encodes a member of the stathmin family of phosphoproteins. Stathmin proteins function in microtubule dynamics and signal transduction. The encoded protein plays a regulatory role in neuronal growth and is also thought to be involved in osteogenesis. Reductions in the expression of this gene have been associated with Down’s syndrome and Alzheimer’s disease. Alternatively spliced transcript variants have been observed for this gene. A pseudogene of this gene is located on the long arm of chromosome 6. | ENSG00000104435 | stathmin 2 | 11075 | STMN2 | NA |
| This gene encodes a growth factor found in placenta which is homologous to vascular endothelial growth factor. Alternatively spliced transcripts encoding different isoforms have been found for this gene. | ENSG00000119630 | placental growth factor | 5228 | PGF | NA |
| Fructose-1,6-bisphosphate aldolase (EC 4.1.2.13) is a tetrameric glycolytic enzyme that catalyzes the reversible conversion of fructose-1,6-bisphosphate to glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. Vertebrates have 3 aldolase isozymes which are distinguished by their electrophoretic and catalytic properties. Differences indicate that aldolases A, B, and C are distinct proteins, the products of a family of related ‘housekeeping’ genes exhibiting developmentally regulated expression of the different isozymes. The developing embryo produces aldolase A, which is produced in even greater amounts in adult muscle where it can be as much as 5% of total cellular protein. In adult liver, kidney and intestine, aldolase A expression is repressed and aldolase B is produced. In brain and other nervous tissue, aldolase A and C are expressed about equally. There is a high degree of homology between aldolase A and C. Defects in ALDOB cause hereditary fructose intolerance. | ENSG00000136872 | aldolase, fructose-bisphosphate B | 229 | ALDOB | NA |
| This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | ENSG00000204983 | protease, serine 1 | 5644 | PRSS1 | NA |
| NA | ENSG00000267274 | NA | ENSG00000267274 | CTD-2006C1.12 | NA |
| This gene encodes a member of the chondroitin N-acetylgalactosaminyltransferase family. These enzymes possess dual glucuronyltransferase and galactosaminyltransferase activity and play critical roles in the biosynthesis of chondroitin sulfate, a glycosaminoglycan involved in many biological processes including cell proliferation and morphogenesis. Decreased expression of this gene may play a role in colorectal cancer, and mutations in this gene are a cause of temtamy preaxial brachydactyly syndrome. | ENSG00000131873 | chondroitin sulfate synthase 1 | 22856 | CHSY1 | NA |
| NA | ENSG00000132465 | joining chain of multimeric IgA and IgM | 3512 | JCHAIN | NA |
| NA | ENSG00000203886 | CYP17A1 antisense RNA 1 | 102724307 | CYP17A1-AS1 | NA |
| G protein-coupled receptors (GPCRs) play key roles in a variety of physiologic functions. Members of the leucine-rich GPCR (LGR) family, such as GPR48, have multiple N-terminal leucine-rich repeats (LRRs) and a 7-transmembrane domain (Weng et al., 2008 [PubMed 18424556]). | ENSG00000205213 | leucine rich repeat containing G protein-coupled receptor 4 | 55366 | LGR4 | NA |
| Zinc finger proteins, such as ZNF385A, are regulatory proteins that act as transcription factors, bind single- or double-stranded RNA, or interact with other proteins (Sharma et al., 2004 [PubMed 15527981]). | ENSG00000161642 | zinc finger protein 385A | 25946 | ZNF385A | NA |
| NA | ENSG00000240758 | NA | ENSG00000240758 | RP11-155G14.6 | NA |
| HLA-DRB1 belongs to the HLA class II beta chain paralogs. The class II molecule is a heterodimer consisting of an alpha (DRA) and a beta chain (DRB), both anchored in the membrane. It plays a central role in the immune system by presenting peptides derived from extracellular proteins. Class II molecules are expressed in antigen presenting cells (APC: B lymphocytes, dendritic cells, macrophages). The beta chain is approximately 26-28 kDa. It is encoded by 6 exons. Exon one encodes the leader peptide; exons 2 and 3 encode the two extracellular domains; exon 4 encodes the transmembrane domain; and exon 5 encodes the cytoplasmic tail. Within the DR molecule the beta chain contains all the polymorphisms specifying the peptide binding specificities. Hundreds of DRB1 alleles have been described and typing for these polymorphisms is routinely done for bone marrow and kidney transplantation. DRB1 is expressed at a level five times higher than its paralogs DRB3, DRB4 and DRB5. DRB1 is present in all individuals. Allelic variants of DRB1 are linked with either none or one of the genes DRB3, DRB4 and DRB5. There are 4 related pseudogenes: DRB2, DRB6, DRB7, DRB8 and DRB9. | ENSG00000196126 | major histocompatibility complex, class II, DR beta 1 | 3123 | HLA-DRB1 | NA |
| NA | ENSG00000196126 | HLA class II histocompatibility antigen, DRB1-7 beta chain | 105369230 | LOC105369230 | NA |
| This gene likely encodes a member of the carboxypeptidase family of proteins. Cloning of a comparable locus in mouse indicates that the encoded protein contains a discoidin domain and a carboxypeptidase domain, but the protein appears to lack residues necessary for carboxypeptidase activity. | ENSG00000088882 | carboxypeptidase X (M14 family), member 1 | 56265 | CPXM1 | NA |
| NA | ENSG00000256928 | NA | ENSG00000256928 | RP11-809N8.2 | NA |
| Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | ENSG00000185615 | protein disulfide isomerase family A member 2 | 64714 | PDIA2 | NA |
| NA | ENSG00000163040 | coiled-coil domain containing 74A | 90557 | CCDC74A | NA |
| The WNT gene family consists of structurally related genes which encode secreted signaling proteins. These proteins have been implicated in oncogenesis and in several developmental processes, including regulation of cell fate and patterning during embryogenesis. This gene is a member of the WNT gene family. It encodes a protein which shows 97%, 85%, and 63% amino acid identity with mouse, chicken, and Xenopus Wnt11 protein, respectively. This gene may play roles in the development of skeleton, kidney and lung, and is considered to be a plausible candidate gene for High Bone Mass Syndrome. | ENSG00000085741 | Wnt family member 11 | 7481 | WNT11 | NA |
| This gene encodes a member of the regulators of G protein signaling (RGS) family. The RGS proteins are signal transduction molecules which are involved in the regulation of heterotrimeric G proteins by acting as GTPase activators. This gene is a hypoxia-inducible factor-1 dependent, hypoxia-induced gene which is involved in the induction of endothelial apoptosis. This gene is also one of three genes on chromosome 1q contributing to elevated blood pressure. Alternatively spliced transcript variants have been identified. | ENSG00000143248 | regulator of G-protein signaling 5 | 8490 | RGS5 | NA |
| NA | ENSG00000270172 | NA | NA | NA | TRUE |
| Defensins are a family of antimicrobial and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. Several of the alpha defensin genes appear to be clustered on chromosome 8. The protein encoded by this gene, defensin, alpha 5, is highly expressed in the secretory granules of Paneth cells of the ileum. | ENSG00000164816 | defensin alpha 5 | 1670 | DEFA5 | NA |
| This gene encodes neutrophil cytosolic factor 2, the 67-kilodalton cytosolic subunit of the multi-protein NADPH oxidase complex found in neutrophils. This oxidase produces a burst of superoxide which is delivered to the lumen of the neutrophil phagosome. Mutations in this gene, as well as in other NADPH oxidase subunits, can result in chronic granulomatous disease, a disease that causes recurrent infections by catalase-positive organisms. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000116701 | neutrophil cytosolic factor 2 | 4688 | NCF2 | NA |
| LY6G6C belongs to a cluster of leukocyte antigen-6 (LY6) genes located in the major histocompatibility complex (MHC) class III region on chromosome 6. Members of the LY6 superfamily typically contain 70 to 80 amino acids, including 8 to 10 cysteines. Most LY6 proteins are attached to the cell surface by a glycosylphosphatidylinositol (GPI) anchor that is directly involved in signal transduction (Mallya et al., 2002 [PubMed 12079290]). | ENSG00000204421 | lymphocyte antigen 6 complex, locus G6C | 80740 | LY6G6C | NA |
| NA | ENSG00000249007 | NA | ENSG00000249007 | RP11-510N19.5 | NA |
| Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | ENSG00000219073 | chymotrypsin like elastase family member 3B | 23436 | CELA3B | NA |
| TMEM97 is a conserved integral membrane protein that plays a role in controlling cellular cholesterol levels (Bartz et al., 2009 [PubMed 19583955]). | ENSG00000109084 | transmembrane protein 97 | 27346 | TMEM97 | NA |
| This gene encodes a type I transmembrane protein that is localized to junctional complexes between endothelial and epithelial cells and may have a role in cell-cell adhesion. Expression of this gene in white adipose tissue is implicated in adipocyte maturation and development of obesity. This gene is also essential for normal intestinal development and mutations in the gene are associated with congenital short bowel syndrome. | ENSG00000166250 | CXADR-like membrane protein | 79827 | CLMP | NA |
| NA | ENSG00000236047 | NA | ENSG00000236047 | AC073410.1 | NA |
# Save the query IDs for cluster 6, one per line.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 6, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
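The exported list contains every row of `out$query`, so an ID that maps to two records (ENSG00000196126 in the table above) is written twice, and IDs flagged `notfound` are written as well. If a downstream tool expects each resolved gene once, the list could be filtered first. The following is a minimal sketch under that assumption; `clean_query_ids` is a hypothetical helper, not part of the original analysis or of the mygene package, and the toy data frame only mimics the columns shown in the tables.

```r
# Hypothetical helper (not from the original analysis): return the query IDs
# that were resolved, each listed once, in their original order.
clean_query_ids <- function(out) {
  df <- as.data.frame(out)
  # The `notfound` column is assumed to exist only when some query failed.
  if ("notfound" %in% names(df)) {
    keep <- is.na(df$notfound) | !df$notfound
    df <- df[keep, , drop = FALSE]
  }
  unique(as.character(df$query))
}

# Toy stand-in for a queryMany() result, built from rows shown above.
toy <- data.frame(
  query = c("ENSG00000196126", "ENSG00000196126",
            "ENSG00000270172", "ENSG00000088882"),
  symbol = c("HLA-DRB1", "LOC105369230", NA, "CPXM1"),
  notfound = c(NA, NA, TRUE, NA),
  stringsAsFactors = FALSE
)

ids <- clean_query_ids(toy)
print(ids)
```

The resulting vector could then be passed to `write.table` in place of `as.factor(out$query)`. Whether duplicates and unmatched IDs should be dropped depends on what the downstream step expects, so this is a design choice rather than a correction.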
# Annotate the top genes for cluster 7.
out <- mygene::queryMany(gene_list[7, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| query | X_id | name | summary | symbol | notfound |
|---|---|---|---|---|---|
| ENSG00000171476 | 84525 | HOP homeobox | The protein encoded by this gene is a homeodomain protein that lacks certain conserved residues required for DNA binding. It was reported that choriocarcinoma cell lines and tissues failed to express this gene, which suggested the possible involvement of this gene in malignant conversion of placental trophoblasts. Studies in mice suggest that this protein may interact with serum response factor (SRF) and modulate SRF-dependent cardiac-specific gene expression and cardiac development. Multiple alternatively spliced transcript variants have been identified for this gene. | HOPX | NA |
| ENSG00000079215 | 6507 | solute carrier family 1 member 3 | This gene encodes a member of a high affinity glutamate transporter family. This gene functions in the termination of excitatory neurotransmission in the central nervous system. Mutations are associated with episodic ataxia, Type 6. Alternative splicing results in multiple transcript variants. | SLC1A3 | NA |
| ENSG00000234964 | ENSG00000234964 | fatty acid binding protein 5 pseudogene 7 | NA | FABP5P7 | NA |
| ENSG00000117115 | 11240 | peptidyl arginine deiminase 2 | This gene encodes a member of the peptidyl arginine deiminase family of enzymes, which catalyze the post-translational deimination of proteins by converting arginine residues into citrullines in the presence of calcium ions. The family members have distinct substrate specificities and tissue-specific expression patterns. The type II enzyme is the most widely expressed family member. Known substrates for this enzyme include myelin basic protein in the central nervous system and vimentin in skeletal muscle and macrophages. This enzyme is thought to play a role in the onset and progression of neurodegenerative human disorders, including Alzheimer disease and multiple sclerosis, and it has also been implicated in glaucoma pathogenesis. This gene exists in a cluster with four other paralogous genes. | PADI2 | NA |
| ENSG00000161682 | 284069 | family with sequence similarity 171 member A2 | NA | FAM171A2 | NA |
| ENSG00000228314 | 54055 | cytochrome P450 family 4 subfamily F member 29, pseudogene | NA | CYP4F29P | NA |
| ENSG00000185615 | 64714 | protein disulfide isomerase family A member 2 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | PDIA2 | NA |
| ENSG00000108405 | 5023 | purinergic receptor P2X 1 | The protein encoded by this gene belongs to the P2X family of G-protein-coupled receptors. These proteins can form homo- and heterotrimers and function as ATP-gated ion channels and mediate rapid and selective permeability to cations. This protein is primarily localized to smooth muscle where it binds ATP and mediates synaptic transmission between neurons and from neurons to smooth muscle and may be responsible for sympathetic vasoconstriction in small arteries, arterioles and vas deferens. Mouse studies suggest that this receptor is essential for normal male reproductive function. This protein may also be involved in promoting apoptosis. | P2RX1 | NA |
| ENSG00000172023 | 5968 | regenerating family member 1 beta | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1B | NA |
| ENSG00000110852 | 9976 | C-type lectin domain family 2 member B | This gene encodes a member of the C-type lectin/C-type lectin-like domain (CTL/CTLD) superfamily. Members of this family share a common protein fold and have diverse functions, such as cell adhesion, cell-cell signalling, glycoprotein turnover, and roles in inflammation and immune response. The encoded type 2 transmembrane protein may function as a cell activation antigen. An alternative splice variant has been described but its full-length sequence has not been determined. This gene is closely linked to other CTL/CTLD superfamily members on chromosome 12p13 in the natural killer gene complex region. | CLEC2B | NA |
| ENSG00000115386 | 5967 | regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | REG1A | NA |
| ENSG00000171444 | 4163 | mutated in colorectal cancers | This gene is a candidate colorectal tumor suppressor gene that is thought to negatively regulate cell cycle progression. The orthologous gene in the mouse expresses a phosphoprotein associated with the plasma membrane and membrane organelles, and overexpression of the mouse protein inhibits entry into S phase. Multiple transcript variants encoding different isoforms have been found for this gene. | MCC | NA |
| ENSG00000172016 | 5068 | regenerating family member 3 alpha | This gene encodes a pancreatic secretory protein that may be involved in cell proliferation or differentiation. It has similarity to the C-type lectin superfamily. The enhanced expression of this gene is observed during pancreatic inflammation and liver carcinogenesis. The mature protein also functions as an antimicrobial protein with antibacterial activity. Alternate splicing results in multiple transcript variants that encode the same protein. | REG3A | NA |
| ENSG00000204983 | 5644 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | PRSS1 | NA |
| ENSG00000088882 | 56265 | carboxypeptidase X (M14 family), member 1 | This gene likely encodes a member of the carboxypeptidase family of proteins. Cloning of a comparable locus in mouse indicates that the encoded protein contains a discoidin domain and a carboxypeptidase domain, but the protein appears to lack residues necessary for carboxypeptidase activity. | CPXM1 | NA |
| ENSG00000079385 | 634 | carcinoembryonic antigen related cell adhesion molecule 1 | This gene encodes a member of the carcinoembryonic antigen (CEA) gene family, which belongs to the immunoglobulin superfamily. Two subgroups of the CEA family, the CEA cell adhesion molecules and the pregnancy-specific glycoproteins, are located within a 1.2 Mb cluster on the long arm of chromosome 19. Eleven pseudogenes of the CEA cell adhesion molecule subgroup are also found in the cluster. The encoded protein was originally described in bile ducts of liver as biliary glycoprotein. Subsequently, it was found to be a cell-cell adhesion molecule detected on leukocytes, epithelia, and endothelia. The encoded protein mediates cell adhesion via homophilic as well as heterophilic binding to other proteins of the subgroup. Multiple cellular activities have been attributed to the encoded protein, including roles in the differentiation and arrangement of tissue three-dimensional structure, angiogenesis, apoptosis, tumor suppression, metastasis, and the modulation of innate and adaptive immune responses. Multiple transcript variants encoding different isoforms have been reported, but the full-length nature of all variants has not been defined. | CEACAM1 | NA |
| ENSG00000137857 | 53905 | dual oxidase 1 | The protein encoded by this gene is a glycoprotein and a member of the NADPH oxidase family. The synthesis of thyroid hormone is catalyzed by a protein complex located at the apical membrane of thyroid follicular cells. This complex contains an iodide transporter, thyroperoxidase, and a peroxide generating system that includes proteins encoded by this gene and the similar DUOX2 gene. This protein is known as dual oxidase because it has both a peroxidase homology domain and a gp91phox domain. This protein generates hydrogen peroxide and thereby plays a role in the activity of thyroid peroxidase, lactoperoxidase, and in lactoperoxidase-mediated antimicrobial defense at mucosal surfaces. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. | DUOX1 | NA |
| ENSG00000219073 | 23436 | chymotrypsin like elastase family member 3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | CELA3B | NA |
| ENSG00000140459 | 1583 | cytochrome P450 family 11 subfamily A member 1 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and catalyzes the conversion of cholesterol to pregnenolone, the first and rate-limiting step in the synthesis of the steroid hormones. Two transcript variants encoding different isoforms have been found for this gene. The cellular location of the smaller isoform is unclear since it lacks the mitochondrial-targeting transit peptide. | CYP11A1 | NA |
| ENSG00000162482 | 22977 | aldo-keto reductase family 7 member A3 | Aldo-keto reductases, such as AKR7A3, are involved in the detoxification of aldehydes and ketones. | AKR7A3 | NA |
| ENSG00000143320 | 1382 | cellular retinoic acid binding protein 2 | This gene encodes a member of the retinoic acid (RA, a form of vitamin A) binding protein family and lipocalin/cytosolic fatty-acid binding protein family. The protein is a cytosol-to-nuclear shuttling protein, which facilitates RA binding to its cognate receptor complex and transfer to the nucleus. It is involved in the retinoid signaling pathway, and is associated with increased circulating low-density lipoprotein cholesterol. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | CRABP2 | NA |
| ENSG00000272275 | ENSG00000272275 | NA | NA | RP11-791G15.2 | NA |
| ENSG00000268995 | ENSG00000268995 | vomeronasal 1 receptor 82 pseudogene | NA | VN1R82P | NA |
| ENSG00000139540 | 283375 | solute carrier family 39 member 5 | The protein encoded by this gene belongs to the ZIP family of zinc transporters that transport zinc into cells from outside, and play a crucial role in controlling intracellular zinc levels. Zinc is an essential cofactor for many enzymes and proteins involved in gene transcription, growth, development and differentiation. Mutations in this gene have been associated with autosomal dominant high myopia (MYP24). Alternatively spliced transcript variants have been found for this gene. | SLC39A5 | NA |
| ENSG00000169347 | 2813 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | GP2 | NA |
| ENSG00000128594 | 64101 | leucine rich repeat containing 4 | This gene is significantly downregulated in primary brain tumors. The exact function of the protein encoded by this gene is unknown. | LRRC4 | NA |
| ENSG00000082397 | 23136 | erythrocyte membrane protein band 4.1 like 3 | NA | EPB41L3 | NA |
| ENSG00000118271 | 7276 | transthyretin | This gene encodes transthyretin, one of the three prealbumins including alpha-1-antitrypsin, transthyretin and orosomucoid. Transthyretin is a carrier protein; it transports thyroid hormones in the plasma and cerebrospinal fluid, and also transports retinol (vitamin A) in the plasma. The protein consists of a tetramer of identical subunits. More than 80 different mutations in this gene have been reported; most mutations are related to amyloid deposition, affecting predominantly peripheral nerve and/or the heart, and a small portion of the gene mutations is non-amyloidogenic. The diseases caused by mutations include amyloidotic polyneuropathy, euthyroid hyperthyroxinaemia, amyloidotic vitreous opacities, cardiomyopathy, oculoleptomeningeal amyloidosis, meningocerebrovascular amyloidosis, carpal tunnel syndrome, etc. | TTR | NA |
| ENSG00000162551 | 249 | alkaline phosphatase, liver/bone/kidney | This gene encodes a member of the alkaline phosphatase family of proteins. There are at least four distinct but related alkaline phosphatases: intestinal, placental, placental-like, and liver/bone/kidney (tissue non-specific). The first three are located together on chromosome 2, while the tissue non-specific form is located on chromosome 1. The product of this gene is a membrane bound glycosylated enzyme that is not expressed in any particular tissue and is, therefore, referred to as the tissue-nonspecific form of the enzyme. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature enzyme. This enzyme may play a role in bone mineralization. Mutations in this gene have been linked to hypophosphatasia, a disorder that is characterized by hypercalcemia and skeletal defects. | ALPL | NA |
| ENSG00000189184 | 54510 | protocadherin 18 | This gene belongs to the protocadherin gene family, a subfamily of the cadherin superfamily. This gene encodes a protein which contains 6 extracellular cadherin domains, a transmembrane domain and a cytoplasmic tail differing from those of the classical cadherins. Although its specific function is undetermined, the cadherin-related neuronal receptor is thought to play a role in the establishment and function of specific cell-cell connections in the brain. | PCDH18 | NA |
| ENSG00000129467 | 196883 | adenylate cyclase 4 | This gene encodes a member of the family of adenylate cyclases, which are membrane-associated enzymes that catalyze the formation of the secondary messenger cyclic adenosine monophosphate (cAMP). Mouse studies show that adenylate cyclase 4, along with adenylate cyclases 2 and 3, is expressed in olfactory cilia, suggesting that several different adenylate cyclases may couple to olfactory receptors and that there may be multiple receptor-mediated mechanisms for the generation of cAMP signals. Alternative splicing results in transcript variants. | ADCY4 | NA |
| ENSG00000142789 | 10136 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | CELA3A | NA |
| ENSG00000089041 | 5027 | purinergic receptor P2X 7 | The product of this gene belongs to the family of purinoceptors for ATP. This receptor functions as a ligand-gated ion channel and is responsible for ATP-dependent lysis of macrophages through the formation of membrane pores permeable to large molecules. Activation of this nuclear receptor by ATP in the cytoplasm may be a mechanism by which cellular activity can be coupled to changes in gene expression. Multiple alternatively spliced variants have been identified, most of which fit nonsense-mediated decay (NMD) criteria. | P2RX7 | NA |
| ENSG00000171345 | 3880 | keratin 19 | The protein encoded by this gene is a member of the keratin family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. The type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains. Unlike its related family members, this smallest known acidic cytokeratin is not paired with a basic cytokeratin in epithelial cells. It is specifically expressed in the periderm, the transiently superficial layer that envelopes the developing epidermis. The type I cytokeratins are clustered in a region of chromosome 17q12-q21. | KRT19 | NA |
| ENSG00000110245 | 345 | apolipoprotein C3 | Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-I and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | APOC3 | NA |
| ENSG00000157107 | 115548 | FCH domain only 2 | NA | FCHO2 | NA |
| ENSG00000121680 | 9409 | peroxisomal biogenesis factor 16 | The protein encoded by this gene is an integral peroxisomal membrane protein. An inactivating nonsense mutation localized to this gene was observed in a patient with Zellweger syndrome of the complementation group CGD/CG9. Expression of this gene product morphologically and biochemically restores the formation of new peroxisomes, suggesting a role in peroxisome organization and biogenesis. Alternative splicing has been observed for this gene and two variants have been described. | PEX16 | NA |
| ENSG00000185559 | 8788 | delta like non-canonical Notch ligand 1 | This gene encodes a transmembrane protein that contains multiple epidermal growth factor repeats that functions as a regulator of cell growth. The encoded protein is involved in the differentiation of several cell types including adipocytes. This gene is located in a region of chromosome 14 frequently showing uniparental disomy, and is imprinted and expressed from the paternal allele. A single nucleotide variant in this gene is associated with child and adolescent obesity and shows polar overdominance, where heterozygotes carrying an active paternal allele express the phenotype, while mutant homozygotes are normal. | DLK1 | NA |
| ENSG00000132329 | 10267 | receptor activity modifying protein 1 | The protein encoded by this gene is a member of the RAMP family of single-transmembrane-domain proteins, called receptor (calcitonin) activity modifying proteins (RAMPs). RAMPs are type I transmembrane proteins with an extracellular N terminus and a cytoplasmic C terminus. RAMPs are required to transport calcitonin-receptor-like receptor (CRLR) to the plasma membrane. CRLR, a receptor with seven transmembrane domains, can function as either a calcitonin-gene-related peptide (CGRP) receptor or an adrenomedullin receptor, depending on which members of the RAMP family are expressed. In the presence of this (RAMP1) protein, CRLR functions as a CGRP receptor. The RAMP1 protein is involved in the terminal glycosylation, maturation, and presentation of the CGRP receptor to the cell surface. Alternative splicing results in multiple transcript variants encoding different isoforms. | RAMP1 | NA |
| ENSG00000259185 | ENSG00000259185 | NA | NA | RP11-56B16.4 | NA |
| ENSG00000171033 | 5569 | protein kinase (cAMP-dependent, catalytic) inhibitor alpha | The protein encoded by this gene is a member of the cAMP-dependent protein kinase (PKA) inhibitor family. This protein was demonstrated to interact with and inhibit the activities of both C alpha and C beta catalytic subunits of the PKA. Alternatively spliced transcript variants encoding the same protein have been reported. | PKIA | NA |
| ENSG00000254198 | ENSG00000254198 | NA | NA | RP11-598P20.3 | NA |
| ENSG00000196263 | 57573 | zinc finger protein 471 | NA | ZNF471 | NA |
| ENSG00000063127 | 28968 | solute carrier family 6 member 16 | SLC6A16 shows structural characteristics of an Na(+)- and Cl(-)-dependent neurotransmitter transporter, including 12 transmembrane (TM) domains, intracellular N and C termini, and large extracellular loops containing multiple N-glycosylation sites. | SLC6A16 | NA |
| ENSG00000162882 | 23498 | 3-hydroxyanthranilate 3,4-dioxygenase | 3-Hydroxyanthranilate 3,4-dioxygenase is a monomeric cytosolic protein belonging to the family of intramolecular dioxygenases containing nonheme ferrous iron. It is widely distributed in peripheral organs, such as liver and kidney, and is also present in low amounts in the central nervous system. HAAO catalyzes the synthesis of quinolinic acid (QUIN) from 3-hydroxyanthranilic acid. QUIN is an excitotoxin whose toxicity is mediated by its ability to activate glutamate N-methyl-D-aspartate receptors. Increased cerebral levels of QUIN may participate in the pathogenesis of neurologic and inflammatory disorders. HAAO has been suggested to play a role in disorders associated with altered tissue levels of QUIN. | HAAO | NA |
| ENSG00000175445 | 4023 | lipoprotein lipase | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | LPL | NA |
| ENSG00000117477 | 57821 | coiled-coil domain containing 181 | NA | CCDC181 | NA |
| ENSG00000104723 | 7991 | tumor suppressor candidate 3 | This gene is a candidate tumor suppressor gene. It is located within a homozygously deleted region of a metastatic prostate cancer. The gene is expressed in most nonlymphoid human tissues including prostate, lung, liver, and colon. Expression was also detected in many epithelial tumor cell lines. Two transcript variants encoding distinct isoforms have been identified for this gene. | TUSC3 | NA |
| ENSG00000118898 | 5493 | periplakin | The protein encoded by this gene is a component of desmosomes and of the epidermal cornified envelope in keratinocytes. The N-terminal domain of this protein interacts with the plasma membrane and its C-terminus interacts with intermediate filaments. Through its rod domain, this protein forms complexes with envoplakin. This protein may serve as a link between the cornified envelope and desmosomes as well as intermediate filaments. AKT1/PKB, a protein kinase mediating a variety of cell growth and survival signaling processes, is reported to interact with this protein, suggesting a possible role for this protein as a localization signal in AKT1-mediated signaling. | PPL | NA |
| ENSG00000175535 | 5406 | pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | PNLIP | NA |
| ENSG00000165029 | 19 | ATP binding cassette subfamily A member 1 | The membrane-associated protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intracellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the ABC1 subfamily. Members of the ABC1 subfamily comprise the only major ABC subfamily found exclusively in multicellular eukaryotes. With cholesterol as its substrate, this protein functions as a cholesterol efflux pump in the cellular lipid removal pathway. Mutations in this gene have been associated with Tangier’s disease and familial high-density lipoprotein deficiency. | ABCA1 | NA |
| ENSG00000232573 | ENSG00000232573 | ribosomal protein L3 pseudogene 4 | NA | RPL3P4 | NA |
| ENSG00000149806 | 2197 | Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed | This gene is the cellular homolog of the fox sequence in the Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV). It encodes a fusion protein consisting of the ubiquitin-like protein fubi at the N terminus and ribosomal protein S30 at the C terminus. It has been proposed that the fusion protein is post-translationally processed to generate free fubi and free ribosomal protein S30. Fubi is a member of the ubiquitin family, and ribosomal protein S30 belongs to the S30E family of ribosomal proteins. Whereas the function of fubi is currently unknown, ribosomal protein S30 is a component of the 40S subunit of the cytoplasmic ribosome and displays antimicrobial activity. Pseudogenes derived from this gene are present in the genome. Similar to ribosomal protein S30, ribosomal proteins S27a and L40 are synthesized as fusion proteins with ubiquitin. | FAU | NA |
| ENSG00000132622 | 116835 | heat shock protein family A (Hsp70) member 12B | The protein encoded by this gene contains an atypical heat shock protein 70 (Hsp70) ATPase domain and is therefore a distant member of the mammalian Hsp70 family. This gene may be involved in susceptibility to atherosclerosis. Alternative splicing results in multiple transcript variants encoding different isoforms. | HSPA12B | NA |
| ENSG00000258177 | ENSG00000258177 | NA | NA | RP11-394J1.2 | NA |
| ENSG00000139926 | 122786 | FERM domain containing 6 | NA | FRMD6 | NA |
| ENSG00000105427 | 84518 | cornifelin | NA | CNFN | NA |
| ENSG00000184990 | 10572 | SIVA1 apoptosis inducing factor | This gene encodes a protein with an important role in the apoptotic (programmed cell death) pathway induced by the CD27 antigen, a member of the tumor necrosis factor receptor (TNFR) superfamily. The CD27 antigen cytoplasmic tail binds to the N-terminus of this protein. Two alternatively spliced transcript variants encoding distinct proteins have been described. | SIVA1 | NA |
| ENSG00000169245 | 3627 | C-X-C motif chemokine ligand 10 | This antimicrobial gene encodes a chemokine of the CXC subfamily and ligand for the receptor CXCR3. Binding of this protein to CXCR3 results in pleiotropic effects, including stimulation of monocytes, natural killer and T-cell migration, and modulation of adhesion molecule expression. | CXCL10 | NA |
| ENSG00000006451 | 5898 | RALA Ras like proto-oncogene A | The product of this gene belongs to the small GTPase superfamily, Ras family of proteins. GTP-binding proteins mediate the transmembrane signaling initiated by the occupancy of certain cell surface receptors. This gene encodes a low molecular mass ras-like GTP-binding protein that shares about 50% similarity with other ras proteins. | RALA | NA |
| ENSG00000251442 | ENSG00000251442 | long intergenic non-protein coding RNA 1094 | NA | LINC01094 | NA |
| ENSG00000104177 | 50804 | myelin expression factor 2 | NA | MYEF2 | NA |
| ENSG00000145824 | 9547 | C-X-C motif chemokine ligand 14 | This antimicrobial gene belongs to the cytokine gene family which encode secreted proteins involved in immunoregulatory and inflammatory processes. The protein encoded by this gene is structurally related to the CXC (Cys-X-Cys) subfamily of cytokines. Members of this subfamily are characterized by two cysteines separated by a single amino acid. This cytokine displays chemotactic activity for monocytes but not for lymphocytes, dendritic cells, neutrophils or macrophages. It has been implicated that this cytokine is involved in the homeostasis of monocyte-derived macrophages rather than in inflammation. | CXCL14 | NA |
| ENSG00000169860 | 5028 | purinergic receptor P2Y1 | The product of this gene belongs to the family of G-protein coupled receptors. This family has several receptor subtypes with different pharmacological selectivity, which overlaps in some cases, for various adenosine and uridine nucleotides. This receptor functions as a receptor for extracellular ATP and ADP. In platelets binding to ADP leads to mobilization of intracellular calcium ions via activation of phospholipase C, a change in platelet shape, and probably to platelet aggregation. | P2RY1 | NA |
| ENSG00000108950 | 54757 | family with sequence similarity 20 member A | This locus encodes a protein that is likely secreted and may function in hematopoiesis. A mutation at this locus has been associated with amelogenesis imperfecta and gingival hyperplasia syndrome. Alternatively spliced transcript variants have been identified. | FAM20A | NA |
| ENSG00000158270 | 81035 | collectin subfamily member 12 | This gene encodes a member of the C-lectin family, proteins that possess collagen-like sequences and carbohydrate recognition domains. This protein is a scavenger receptor, a cell surface glycoprotein that displays several functions associated with host defense. It can bind to carbohydrate antigens on microorganisms, facilitating their recognition and removal. It also mediates the recognition, internalization, and degradation of oxidatively modified low density lipoprotein by vascular endothelial cells. | COLEC12 | NA |
| ENSG00000114854 | 7134 | troponin C1, slow skeletal and cardiac type | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. | TNNC1 | NA |
| ENSG00000214050 | 157574 | F-box protein 16 | This gene encodes a member of the F-box protein family, members of which are characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into three classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbx class. Multiple transcript variants encoding different isoforms have been found for this gene. | FBXO16 | NA |
| ENSG00000132688 | 10763 | nestin | This gene encodes a member of the intermediate filament protein family and is expressed primarily in nerve cells. | NES | NA |
| ENSG00000075275 | 9620 | cadherin EGF LAG seven-pass G-type receptor 1 | The protein encoded by this gene is a member of the flamingo subfamily, part of the cadherin superfamily. The flamingo subfamily consists of nonclassic-type cadherins; a subpopulation that does not interact with catenins. The flamingo cadherins are located at the plasma membrane and have nine cadherin domains, seven epidermal growth factor-like repeats and two laminin A G-type repeats in their ectodomain. They also have seven transmembrane domains, a characteristic unique to this subfamily. It is postulated that these proteins are receptors involved in contact-mediated communication, with cadherin domains acting as homophilic binding regions and the EGF-like domains involved in cell adhesion and receptor-ligand interactions. This particular member is a developmentally regulated, neural-specific gene which plays an unspecified role in early embryogenesis. | CELSR1 | NA |
| ENSG00000116106 | 2043 | EPH receptor A4 | This gene belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family. EPH and EPH-related receptors have been implicated in mediating developmental events, particularly in the nervous system. Receptors in the EPH subfamily typically have a single kinase domain and an extracellular region containing a Cys-rich domain and 2 fibronectin type III repeats. The ephrin receptors are divided into 2 groups based on the similarity of their extracellular domain sequences and their affinities for binding ephrin-A and ephrin-B ligands. Multiple transcript variants encoding different isoforms have been found for this gene. | EPHA4 | NA |
| ENSG00000114019 | 51421 | angiomotin like 2 | Angiomotin is a protein that binds angiostatin, a circulating inhibitor of the formation of new blood vessels (angiogenesis). Angiomotin mediates angiostatin inhibition of endothelial cell migration and tube formation in vitro. The protein encoded by this gene is related to angiomotin and is a member of the motin protein family. Alternative splicing results in multiple transcript variants of this gene. | AMOTL2 | NA |
| ENSG00000114770 | 10057 | ATP binding cassette subfamily C member 5 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the MRP subfamily which is involved in multi-drug resistance. This protein functions in the cellular export of its substrate, cyclic nucleotides. This export contributes to the degradation of phosphodiesterases and possibly an elimination pathway for cyclic nucleotides. Studies show that this protein provides resistance to thiopurine anticancer drugs, 6-mercaptopurine and thioguanine, and the anti-HIV drug 9-(2-phosphonylmethoxyethyl)adenine. This protein may be involved in resistance to thiopurines in acute lymphoblastic leukemia and antiretroviral nucleoside analogs in HIV-infected patients. Alternative splicing results in multiple transcript variants. | ABCC5 | NA |
| ENSG00000163141 | 149428 | BCL2/adenovirus E1B 19kD interacting protein like | The protein encoded by this gene interacts with several other proteins, such as BCL2, ARHGAP1, MIF and GFER. It may function as a bridge molecule between BCL2 and ARHGAP1/CDC42 in promoting cell death. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | BNIPL | NA |
| ENSG00000088002 | 6820 | sulfotransferase family 2B member 1 | Sulfotransferase enzymes catalyze the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds. These cytosolic enzymes are different in their tissue distributions and substrate specificities. The gene structure (number and length of exons) is similar among family members. This gene sulfates dehydroepiandrosterone but not 4-nitrophenol, a typical substrate for the phenol and estrogen sulfotransferase subfamilies. Two alternatively spliced variants that encode different isoforms have been described. | SULT2B1 | NA |
| ENSG00000250404 | NA | NA | NA | NA | TRUE |
| ENSG00000182333 | 8513 | lipase F, gastric type | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | LIPF | NA |
| ENSG00000166923 | 26585 | gremlin 1, DAN family BMP antagonist | This gene encodes a member of the BMP (bone morphogenic protein) antagonist family. Like BMPs, BMP antagonists contain cystine knots and typically form homo- and heterodimers. The CAN (cerberus and dan) subfamily of BMP antagonists, to which this gene belongs, is characterized by a C-terminal cystine knot with an eight-membered ring. The antagonistic effect of the secreted glycosylated protein encoded by this gene is likely due to its direct binding to BMP proteins. As an antagonist of BMP, this gene may play a role in regulating organogenesis, body patterning, and tissue differentiation. In mouse, this protein has been shown to relay the sonic hedgehog (SHH) signal from the polarizing region to the apical ectodermal ridge during limb bud outgrowth. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | GREM1 | NA |
| ENSG00000176834 | 54621 | V-set and immunoglobulin domain containing 10 | NA | VSIG10 | NA |
| ENSG00000229874 | NA | NA | NA | NA | TRUE |
| ENSG00000257831 | ENSG00000257831 | NA | NA | RP11-596D21.1 | NA |
| ENSG00000112964 | 2690 | growth hormone receptor | This gene encodes a member of the type I cytokine receptor family, which is a transmembrane receptor for growth hormone. Binding of growth hormone to the receptor leads to receptor dimerization and the activation of an intra- and intercellular signal transduction pathway leading to growth. Mutations in this gene have been associated with Laron syndrome, also known as the growth hormone insensitivity syndrome (GHIS), a disorder characterized by short stature. In humans and rabbits, but not rodents, growth hormone binding protein (GHBP) is generated by proteolytic cleavage of the extracellular ligand-binding domain from the mature growth hormone receptor protein. Multiple alternatively spliced transcript variants have been found for this gene. | GHR | NA |
| ENSG00000250606 | NA | NA | NA | NA | TRUE |
| ENSG00000128512 | 9732 | dedicator of cytokinesis 4 | This gene is a member of the dedicator of cytokinesis (DOCK) family and encodes a protein with a DHR-1 (CZH-1) domain, a DHR-2 (CZH-2) domain and an SH3 domain. This membrane-associated, cytoplasmic protein functions as a guanine nucleotide exchange factor and is involved in regulation of adherens junctions between cells. Mutations in this gene have been associated with ovarian, prostate, glioma, and colorectal cancers. Alternatively spliced variants which encode different protein isoforms have been described, but only one has been fully characterized. | DOCK4 | NA |
| ENSG00000133710 | 11005 | serine peptidase inhibitor, Kazal type 5 | This gene encodes a multidomain serine protease inhibitor that contains 15 potential inhibitory domains. The encoded preproprotein is proteolytically processed to generate multiple protein products, which may exhibit unique activities and specificities. These proteins may play a role in skin and hair morphogenesis, as well as anti-inflammatory and antimicrobial protection of mucous epithelia. Mutations in this gene may result in Netherton syndrome, a disorder characterized by ichthyosis, defective cornification, and atopy. This gene is present in a gene cluster on chromosome 5. Alternative splicing results in multiple transcript variants. | SPINK5 | NA |
| ENSG00000139832 | 55647 | RAB20, member RAS oncogene family | NA | RAB20 | NA |
| ENSG00000169554 | 9839 | zinc finger E-box binding homeobox 2 | The protein encoded by this gene is a member of the Zfh1 family of 2-handed zinc finger/homeodomain proteins. It is located in the nucleus and functions as a DNA-binding transcriptional repressor that interacts with activated SMADs. Mutations in this gene are associated with Hirschsprung disease/Mowat-Wilson syndrome. Alternatively spliced transcript variants have been found for this gene. | ZEB2 | NA |
| ENSG00000168484 | 6440 | surfactant protein C | This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | SFTPC | NA |
| ENSG00000123009 | ENSG00000123009 | NME/NM23 nucleoside diphosphate kinase 2 pseudogene 1 | NA | NME2P1 | NA |
| ENSG00000124785 | 51299 | neuritin 1 | This gene encodes a member of the neuritin family, and is expressed in postmitotic-differentiating neurons of the developmental nervous system and neuronal structures associated with plasticity in the adult. The expression of this gene can be induced by neural activity and neurotrophins. The encoded protein contains a consensus cleavage signal found in glycosylphosphatidylinositol (GPI)-anchored proteins. The encoded protein promotes neurite outgrowth and arborization, suggesting its role in promoting neuritogenesis. Overexpression of the encoded protein may be associated with astrocytoma progression. Alternative splicing results in multiple transcript variants. | NRN1 | NA |
| ENSG00000138161 | 50624 | CUB and zona pellucida like domains 1 | NA | CUZD1 | NA |
| ENSG00000197361 | 283807 | F-box and leucine rich repeat protein 22 | This gene encodes a member of the F-box protein family. This F-box protein interacts with S-phase kinase-associated protein 1A and cullin in order to form SCF complexes which function as ubiquitin ligases. | FBXL22 | NA |
| ENSG00000124225 | 56937 | prostate transmembrane protein, androgen induced 1 | This gene encodes a transmembrane protein that contains a Smad interacting motif (SIM). Expression of this gene is induced by androgens and transforming growth factor beta, and the encoded protein suppresses the androgen receptor and transforming growth factor beta signaling pathways through interactions with Smad proteins. Overexpression of this gene may play a role in multiple types of cancer. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | PMEPA1 | NA |
| ENSG00000104731 | 105371397 | uncharacterized LOC105371397 | NA | LOC105371397 | NA |
| ENSG00000104731 | 54758 | kelch domain containing 4 | NA | KLHDC4 | NA |
| ENSG00000213963 | 100130691 | uncharacterized LOC100130691 | NA | LOC100130691 | NA |
| ENSG00000162390 | 26027 | acyl-CoA thioesterase 11 | This gene encodes a member of the acyl-CoA thioesterase family which catalyse the conversion of activated fatty acids to the corresponding non-esterified fatty acid and coenzyme A. Expression of a mouse homolog in brown adipose tissue is induced by low temperatures and repressed by warm temperatures. Higher levels of expression of the mouse homolog have been found in obesity-resistant mice compared with obesity-prone mice, suggesting a role of acyl-CoA thioesterase 11 in obesity. Alternative splicing results in transcript variants. | ACOT11 | NA |
| ENSG00000139567 | 94 | activin A receptor like type 1 | This gene encodes a type I cell-surface receptor for the TGF-beta superfamily of ligands. It shares with other type I receptors a high degree of similarity in serine-threonine kinase subdomains, a glycine- and serine-rich region (called the GS domain) preceding the kinase domain, and a short C-terminal tail. The encoded protein, sometimes termed ALK1, shares similar domain structures with other closely related ALK or activin receptor-like kinase proteins that form a subfamily of receptor serine/threonine kinases. Mutations in this gene are associated with hemorrhagic telangiectasia type 2, also known as Rendu-Osler-Weber syndrome 2. | ACVRL1 | NA |
| ENSG00000103044 | 3038 | hyaluronan synthase 3 | The protein encoded by this gene is involved in the synthesis of the unbranched glycosaminoglycan hyaluronan, or hyaluronic acid, which is a major constituent of the extracellular matrix. This gene is a member of the NODC/HAS gene family. Compared to the proteins encoded by other members of this gene family, this protein appears to be more of a regulator of hyaluronan synthesis. Alternative splicing results in multiple transcript variants. | HAS3 | NA |
| ENSG00000213280 | ENSG00000213280 | NA | NA | RP11-212P7.1 | NA |
| ENSG00000232774 | 400221 | uncharacterized LOC400221 | NA | FLJ22447 | NA |
# Save the queried Ensembl IDs, one per line, to the text file with suffix 7.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",7,".txt"), col.names = FALSE,
            row.names=FALSE, quote=FALSE);
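The tables in this report contain rows flagged `notfound` (for example ENSG00000250606) and queries that map to more than one record (ENSG00000104731 appears twice). If the saved ID list should hold only matched, unique IDs, a filter along the following lines could be applied before `write.table`. This is only a sketch: `clean_query_ids` is a hypothetical helper that is not part of the original analysis, and `out_toy` is a hand-made stand-in for the `queryMany` result so the example runs without a network call.

```r
# Hypothetical helper (not in the original analysis): return the unique
# query IDs that were matched by the annotation service.
clean_query_ids <- function(out) {
  out <- as.data.frame(out)
  # The notfound column only appears when at least one query was unmatched.
  if ("notfound" %in% names(out)) {
    out <- out[is.na(out$notfound), , drop = FALSE]
  }
  # Collapse queries that returned more than one record.
  unique(as.character(out$query))
}

# Toy stand-in for the queryMany() result, built from rows shown in the tables.
out_toy <- data.frame(
  query    = c("ENSG00000112964", "ENSG00000250606",
               "ENSG00000104731", "ENSG00000104731"),
  symbol   = c("GHR", NA, "LOC105371397", "KLHDC4"),
  notfound = c(NA, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

clean_ids <- clean_query_ids(out_toy)
print(clean_ids)
```

Whether unmatched IDs should be dropped depends on how the saved lists are used downstream; keeping them, as the original call does, preserves the full top-feature list for each factor.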
out <- mygene::queryMany(gene_list[8,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | query | name | summary | notfound |
|---|---|---|---|---|---|
| CHGB | 1114 | ENSG00000089199 | chromogranin B | This gene encodes a tyrosine-sulfated secretory protein abundant in peptidergic endocrine cells and neurons. This protein may serve as a precursor for regulatory peptides. | NA |
| PRSS1 | 5644 | ENSG00000204983 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | NA |
| SLPI | 6590 | ENSG00000124107 | secretory leukocyte peptidase inhibitor | This gene encodes a secreted inhibitor which protects epithelial tissues from serine proteases. It is found in various secretions including seminal plasma, cervical mucus, and bronchial secretions, and has affinity for trypsin, leukocyte elastase, and cathepsin G. Its inhibitory effect contributes to the immune response by protecting epithelial surfaces from attack by endogenous proteolytic enzymes. This antimicrobial protein has antibacterial, antifungal and antiviral activity. | NA |
| REG1A | 5967 | ENSG00000115386 | regenerating family member 1 alpha | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| CELA3A | 10136 | ENSG00000142789 | chymotrypsin like elastase family member 3A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3A has little elastolytic activity. Like most of the human elastases, elastase 3A is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3A preferentially cleaves proteins after alanine residues. Elastase 3A may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1. | NA |
| MYH6 | 4624 | ENSG00000197616 | myosin, heavy chain 6, cardiac muscle, alpha | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. | NA |
| BNIPL | 149428 | ENSG00000163141 | BCL2/adenovirus E1B 19kD interacting protein like | The protein encoded by this gene interacts with several other proteins, such as BCL2, ARHGAP1, MIF and GFER. It may function as a bridge molecule between BCL2 and ARHGAP1/CDC42 in promoting cell death. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | NA |
| TNFAIP8L3 | 388121 | ENSG00000183578 | TNF alpha induced protein 8 like 3 | NA | NA |
| PNLIP | 5406 | ENSG00000175535 | pancreatic lipase | This gene is a member of the lipase gene family. It encodes a carboxyl esterase that hydrolyzes insoluble, emulsified triglycerides, and is essential for the efficient digestion of dietary fats. This gene is expressed specifically in the pancreas. | NA |
| CLPS | 1208 | ENSG00000137392 | colipase | The protein encoded by this gene is a cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It binds to the C-terminal, non-catalytic domain of lipase, thereby stabilizing an active conformation and considerably increasing the overall hydrophobic binding site. The gene product allows lipase to anchor noncovalently to the surface of lipid micelles, counteracting the destabilizing influence of intestinal bile salts. This cofactor is only expressed in pancreatic acinar cells, suggesting regulation of expression by tissue-specific elements. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| TRIM29 | 23650 | ENSG00000137699 | tripartite motif containing 29 | The protein encoded by this gene belongs to the TRIM protein family. It has multiple zinc finger motifs and a leucine zipper motif. It has been proposed to form homo- or heterodimers which are involved in nucleic acid binding. Thus, it may act as a transcriptional regulatory factor involved in carcinogenesis and/or differentiation. It may also function in the suppression of radiosensitivity since it is associated with ataxia telangiectasia phenotype. | NA |
| GP2 | 2813 | ENSG00000169347 | glycoprotein 2 | This gene encodes an integral membrane protein that is secreted from intracellular zymogen granules and associates with the plasma membrane via glycosylphosphatidylinositol (GPI) linkage. The encoded protein binds pathogens such as enterobacteria, thereby playing an important role in the innate immune response. The C-terminus of this protein is related to the C-terminus of the protein encoded by the neighboring gene, uromodulin (UMOD). Alternative splicing results in multiple transcript variants. | NA |
| CIDEC | 63924 | ENSG00000187288 | cell death inducing DFFA like effector c | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | NA |
| RP11-79P5.10 | ENSG00000255883 | ENSG00000255883 | NA | NA | NA |
| SDC1 | 6382 | ENSG00000115884 | syndecan 1 | The protein encoded by this gene is a transmembrane (type I) heparan sulfate proteoglycan and is a member of the syndecan proteoglycan family. The syndecans mediate cell binding, cell signaling, and cytoskeletal organization and syndecan receptors are required for internalization of the HIV-1 tat protein. The syndecan-1 protein functions as an integral membrane protein and participates in cell proliferation, cell migration and cell-matrix interactions via its receptor for extracellular matrix proteins. Altered syndecan-1 expression has been detected in several different tumor types. While several transcript variants may exist for this gene, the full-length natures of only two have been described to date. These two represent the major variants of this gene and encode the same protein. | NA |
| S100A2 | 6273 | ENSG00000196754 | S100 calcium binding protein A2 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may have a tumor suppressor function. Chromosomal rearrangements and altered expression of this gene have been implicated in breast cancer. | NA |
| REG1B | 5968 | ENSG00000172023 | regenerating family member 1 beta | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| CELA2A | 63036 | ENSG00000142615 | chymotrypsin like elastase family member 2A | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2A is secreted from the pancreas as a zymogen. In other species, elastase 2A has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| REG3A | 5068 | ENSG00000172016 | regenerating family member 3 alpha | This gene encodes a pancreatic secretory protein that may be involved in cell proliferation or differentiation. It has similarity to the C-type lectin superfamily. The enhanced expression of this gene is observed during pancreatic inflammation and liver carcinogenesis. The mature protein also functions as an antimicrobial protein with antibacterial activity. Alternate splicing results in multiple transcript variants that encode the same protein. | NA |
| CCDC3 | 83643 | ENSG00000151468 | coiled-coil domain containing 3 | NA | NA |
| LPL | 4023 | ENSG00000175445 | lipoprotein lipase | LPL encodes lipoprotein lipase, which is expressed in heart, muscle, and adipose tissue. LPL functions as a homodimer, and has the dual functions of triglyceride hydrolase and ligand/bridging factor for receptor-mediated lipoprotein uptake. Severe mutations that cause LPL deficiency result in type I hyperlipoproteinemia, while less extreme mutations in LPL are linked to many disorders of lipoprotein metabolism. | NA |
| TNNI3 | 7137 | ENSG00000129991 | troponin I3, cardiac type | Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. This gene encodes the TnI-cardiac protein and is exclusively expressed in cardiac muscle tissues. Mutations in this gene cause familial hypertrophic cardiomyopathy type 7 (CMH7) and familial restrictive cardiomyopathy (RCM). | NA |
| PLA2G1B | 5319 | ENSG00000170890 | phospholipase A2 group IB | This gene encodes a secreted member of the phospholipase A2 (PLA2) class of enzymes, which is produced by the pancreatic acinar cells. The encoded calcium-dependent enzyme catalyzes the hydrolysis of the sn-2 position of membrane glycerophospholipids to release arachidonic acid (AA) and lysophospholipids. AA is subsequently converted by downstream metabolic enzymes to several bioactive lipophilic compounds (eicosanoids), including prostaglandins (PGs) and leukotrienes (LTs). The enzyme may be involved in several physiological processes including cell contraction, cell proliferation and pathological response. | NA |
| PRODH | 5625 | ENSG00000100033 | proline dehydrogenase 1 | This gene encodes a mitochondrial protein that catalyzes the first step in proline degradation. Mutations in this gene are associated with hyperprolinemia type 1 and susceptibility to schizophrenia 4 (SCZD4). This gene is located on chromosome 22q11.21, a region which has also been associated with the contiguous gene deletion syndromes, DiGeorge and CATCH22. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| LIPG | 9388 | ENSG00000101670 | lipase G, endothelial type | The protein encoded by this gene has substantial phospholipase activity and may be involved in lipoprotein metabolism and vascular biology. This protein is designated a member of the TG lipase family by its sequence and characteristic lid region which provides substrate specificity for enzymes of the TG lipase family. | NA |
| NA | NA | ENSG00000250606 | NA | NA | TRUE |
| LDHB | 3945 | ENSG00000111716 | lactate dehydrogenase B | This gene encodes the B subunit of lactate dehydrogenase enzyme, which catalyzes the interconversion of pyruvate and lactate with concomitant interconversion of NADH and NAD+ in a post-glycolysis process. Alternatively spliced transcript variants have been found for this gene. Recent studies have shown that a C-terminally extended isoform is produced by use of an alternative in-frame translation termination codon via a stop codon readthrough mechanism, and that this isoform is localized in the peroxisomes. Mutations in this gene are associated with lactate dehydrogenase B deficiency. Pseudogenes have been identified on chromosomes X, 5 and 13. | NA |
| CTRB2 | 440387 | ENSG00000168928 | chymotrypsinogen B2 | NA | NA |
| PLTP | 5360 | ENSG00000100979 | phospholipid transfer protein | The protein encoded by this gene is one of at least two lipid transfer proteins found in human plasma. The encoded protein transfers phospholipids from triglyceride-rich lipoproteins to high density lipoprotein (HDL). In addition to regulating the size of HDL particles, this protein may be involved in cholesterol metabolism. At least two transcript variants encoding different isoforms have been found for this gene. | NA |
| TRIM63 | 84676 | ENSG00000158022 | tripartite motif containing 63 | This gene encodes a member of the RING zinc finger protein family found in striated muscle and iris. The product of this gene is an E3 ubiquitin ligase that localizes to the Z-line and M-line lattices of myofibrils. This protein plays an important role in the atrophy of skeletal and cardiac muscle and is required for the degradation of myosin heavy chain proteins, myosin light chain, myosin binding protein, and for muscle-type creatine kinase. | NA |
| CSRP3 | 8048 | ENSG00000129170 | cysteine and glycine rich protein 3 | This gene encodes a member of the CSRP family of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this protein is found in a group of proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Mutations in this gene are thought to cause heritable forms of hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM) in humans. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. | NA |
| LY6G6C | 80740 | ENSG00000204421 | lymphocyte antigen 6 complex, locus G6C | LY6G6C belongs to a cluster of leukocyte antigen-6 (LY6) genes located in the major histocompatibility complex (MHC) class III region on chromosome 6. Members of the LY6 superfamily typically contain 70 to 80 amino acids, including 8 to 10 cysteines. Most LY6 proteins are attached to the cell surface by a glycosylphosphatidylinositol (GPI) anchor that is directly involved in signal transduction (Mallya et al., 2002 [PubMed 12079290]). | NA |
| IL34 | 146433 | ENSG00000157368 | interleukin 34 | Interleukin-34 is a cytokine that promotes the differentiation and viability of monocytes and macrophages through the colony-stimulating factor-1 receptor (CSF1R; MIM 164770) (Lin et al., 2008 [PubMed 18467591]). | NA |
| ISM1 | 140862 | ENSG00000101230 | isthmin 1, angiogenesis inhibitor | NA | NA |
| ETS2 | 2114 | ENSG00000157557 | ETS proto-oncogene 2, transcription factor | This gene encodes a transcription factor which regulates genes involved in development and apoptosis. The encoded protein is also a protooncogene and shown to be involved in regulation of telomerase. A pseudogene of this gene is located on the X chromosome. Alternative splicing results in multiple transcript variants. | NA |
| CTRB1 | 1504 | ENSG00000168925 | chymotrypsinogen B1 | The protein encoded by this gene is one of a family of serine proteases that is secreted into the gastrointestinal tract as an inactive precursor, which is activated by proteolytic cleavage with trypsin. | NA |
| RP11-34P1.2 | ENSG00000254373 | ENSG00000254373 | NA | NA | NA |
| CYSRT1 | 375791 | ENSG00000197191 | cysteine rich tail 1 | NA | NA |
| SERPINB8 | 5271 | ENSG00000166401 | serpin family B member 8 | The superfamily of high molecular weight serine proteinase inhibitors (serpins) regulate a diverse set of intracellular and extracellular processes such as complement activation, fibrinolysis, coagulation, cellular differentiation, tumor suppression, apoptosis, and cell migration. Serpins are characterized by a well-conserved tertiary structure that consists of 3 beta sheets and 8 or 9 alpha helices (Huber and Carrell, 1989 [PubMed 2690952]). A critical portion of the molecule, the reactive center loop, connects beta sheets A and C. Protease inhibitor-8 (PI8; SERPINB8) is a member of the ov-serpin subfamily, which, relative to the archetypal serpin PI1 (MIM 107400), is characterized by a high degree of homology to chicken ovalbumin, lack of N- and C-terminal extensions, absence of a signal peptide, and a serine rather than an asparagine residue at the penultimate position (summary by Bartuski et al., 1997 [PubMed 9268635]). | NA |
| CSDC2 | 27254 | ENSG00000172346 | cold shock domain containing C2 | NA | NA |
| IL20RB | 53833 | ENSG00000174564 | interleukin 20 receptor subunit beta | IL20RB and IL20RA (MIM 605620) form a heterodimeric receptor for interleukin-20 (IL20; MIM 605619) (Blumberg et al., 2001 [PubMed 11163236]). | NA |
| MDK | 4192 | ENSG00000110492 | midkine (neurite growth-promoting factor 2) | This gene encodes a member of a small family of secreted growth factors that binds heparin and responds to retinoic acid. The encoded protein promotes cell growth, migration, and angiogenesis, in particular during tumorigenesis. This gene has been targeted as a therapeutic for a variety of different disorders. Alternatively spliced transcript variants encoding multiple isoforms have been observed. | NA |
| RP11-315I20.3 | ENSG00000244619 | ENSG00000244619 | NA | NA | NA |
| ITGA10 | 8515 | ENSG00000143127 | integrin subunit alpha 10 | Integrins are integral transmembrane glycoproteins composed of noncovalently linked alpha and beta chains. They participate in cell adhesion as well as cell-surface mediated signalling. This gene encodes an integrin alpha chain and is expressed at high levels in chondrocytes, where it is transcriptionally regulated by AP-2epsilon and Ets-1. The protein encoded by this gene binds to collagen. Alternative splicing results in multiple transcript variants. | NA |
| LIPE | 3991 | ENSG00000079435 | lipase E, hormone sensitive type | The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | NA |
| PNLIPRP1 | 5407 | ENSG00000187021 | pancreatic lipase related protein 1 | NA | NA |
| AZGP1 | 563 | ENSG00000160862 | alpha-2-glycoprotein 1, zinc-binding | NA | NA |
| RNF144A | 9781 | ENSG00000151692 | ring finger protein 144A | The protein encoded by this gene contains a RING finger, a motif known to be involved in protein-DNA and protein-protein interactions. The mouse counterpart of this protein has been shown to interact with Ube2l3/UbcM4, which is an ubiquitin-conjugating enzyme involved in embryonic development. | NA |
| C3orf18 | 51161 | ENSG00000088543 | chromosome 3 open reading frame 18 | NA | NA |
| OSBPL6 | 114880 | ENSG00000079156 | oxysterol binding protein like 6 | This gene encodes a member of the oxysterol-binding protein (OSBP) family, a group of intracellular lipid receptors. Most members contain an N-terminal pleckstrin homology domain and a highly conserved C-terminal OSBP-like sterol-binding domain. Transcript variants encoding different isoforms have been identified. | NA |
| MYH7 | 4625 | ENSG00000092054 | myosin, heavy chain 7, cardiac muscle, beta | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | NA |
| SCAMP5 | 192683 | ENSG00000198794 | secretory carrier membrane protein 5 | NA | NA |
| NA | NA | ENSG00000156750 | NA | NA | TRUE |
| QPRT | 23475 | ENSG00000103485 | quinolinate phosphoribosyltransferase | This gene encodes a key enzyme in catabolism of quinolinate, an intermediate in the tryptophan-nicotinamide adenine dinucleotide pathway. Quinolinate acts as a most potent endogenous excitotoxin to neurons. Elevation of quinolinate levels in the brain has been linked to the pathogenesis of neurodegenerative disorders such as epilepsy, Alzheimer’s disease, and Huntington’s disease. Alternative splicing results in multiple transcript variants. | NA |
| IL1RN | 3557 | ENSG00000136689 | interleukin 1 receptor antagonist | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This protein inhibits the activities of interleukin 1, alpha (IL1A) and interleukin 1, beta (IL1B), and modulates a variety of interleukin 1 related immune and inflammatory responses. This gene and five other closely related cytokine genes form a gene cluster spanning approximately 400 kb on chromosome 2. A polymorphism of this gene is reported to be associated with increased risk of osteoporotic fractures and gastric cancer. Several alternatively spliced transcript variants encoding distinct isoforms have been reported. | NA |
| PKP2 | 5318 | ENSG00000057294 | plakophilin 2 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This gene product may regulate the signaling activity of beta-catenin. Two alternately spliced transcripts encoding two protein isoforms have been identified. A processed pseudogene with high similarity to this locus has been mapped to chromosome 12p13. | NA |
| MT1X | 4501 | ENSG00000187193 | metallothionein 1X | NA | NA |
| LOC257396 | 257396 | ENSG00000247796 | uncharacterized LOC257396 | NA | NA |
| PPP1R1B | 84152 | ENSG00000131771 | protein phosphatase 1 regulatory inhibitor subunit 1B | This gene encodes a bifunctional signal transduction molecule. Dopaminergic and glutamatergic receptor stimulation regulates its phosphorylation and function as a kinase or phosphatase inhibitor. As a target for dopamine, this gene may serve as a therapeutic target for neurologic and psychiatric disorders. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| LINC01277 | ENSG00000229017 | ENSG00000229017 | long intergenic non-protein coding RNA 1277 | NA | NA |
| CELA3B | 23436 | ENSG00000219073 | chymotrypsin like elastase family member 3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | NA |
| LOC105370792 | 105370792 | ENSG00000174171 | uncharacterized LOC105370792 | NA | NA |
| CST6 | 1474 | ENSG00000175315 | cystatin E/M | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and the kininogens. The type 2 cystatin proteins are a class of cysteine proteinase inhibitors found in a variety of human fluids and secretions, where they appear to provide protective functions. This gene encodes a cystatin from the type 2 family, which is down-regulated in metastatic breast tumor cells as compared to primary tumor cells. Loss of expression is likely associated with the progression of a primary tumor to a metastatic phenotype. | NA |
| CTD-2201G16.1 | ENSG00000258444 | ENSG00000258444 | NA | NA | NA |
| CIB2 | 10518 | ENSG00000136425 | calcium and integrin binding family member 2 | The protein encoded by this gene is similar to that of KIP/CIB, calcineurin B, and calmodulin. The encoded protein is a calcium-binding regulatory protein that interacts with DNA-dependent protein kinase catalytic subunits (DNA-PKcs), and it is involved in photoreceptor cell maintenance. Mutations in this gene cause deafness, autosomal recessive, 48 (DFNB48), and also Usher syndrome 1J (USH1J). Alternative splicing results in multiple transcript variants. | NA |
| RP11-343H19.2 | ENSG00000259827 | ENSG00000259827 | NA | NA | NA |
| UQCRHL | 440567 | ENSG00000233954 | ubiquinol-cytochrome c reductase hinge protein like | This gene has characteristics of a pseudogene derived from the UQCRH gene. However, there is still an open reading frame that could produce a protein of the same or nearly the same size as that of the UQCRH gene, so this gene is being called protein-coding for now. | NA |
| KIAA1671 | 85379 | ENSG00000197077 | KIAA1671 | NA | NA |
| SLC29A4 | 222962 | ENSG00000164638 | solute carrier family 29 member 4 | This gene encodes a member of the SLC29A/ENT transporter protein family. The encoded membrane protein catalyzes the reuptake of monoamines into presynaptic neurons, thus determining the intensity and duration of monoamine neural signaling. It has been shown to transport several compounds, including serotonin, dopamine, and the neurotoxin 1-methyl-4-phenylpyridinium. Alternative splicing results in multiple transcript variants. | NA |
| CELA2B | 51032 | ENSG00000215704 | chymotrypsin like elastase family member 2B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Like most of the human elastases, elastase 2B is secreted from the pancreas as a zymogen. In other species, elastase 2B has been shown to preferentially cleave proteins after leucine, methionine, and phenylalanine residues. | NA |
| SLC47A1 | 55244 | ENSG00000142494 | solute carrier family 47 member 1 | This gene is located within the Smith-Magenis syndrome region on chromosome 17. It encodes a protein of unknown function. | NA |
| PACSIN1 | 29993 | ENSG00000124507 | protein kinase C and casein kinase substrate in neurons 1 | NA | NA |
| PTTG1 | 9232 | ENSG00000164611 | pituitary tumor-transforming 1 | The encoded protein is a homolog of yeast securin proteins, which prevent separins from promoting sister chromatid separation. It is an anaphase-promoting complex (APC) substrate that associates with a separin until activation of the APC. The gene product has transforming activity in vitro and tumorigenic activity in vivo, and the gene is highly expressed in various tumors. The gene product contains 2 PXXP motifs, which are required for its transforming and tumorigenic activities, as well as for its stimulation of basic fibroblast growth factor expression. It also contains a destruction box (D box) that is required for its degradation by the APC. The acidic C-terminal region of the encoded protein can act as a transactivation domain. The gene product is mainly a cytosolic protein, although it partially localizes in the nucleus. Three transcript variants encoding the same protein have been found for this gene. | NA |
| PXDC1 | 221749 | ENSG00000168994 | PX domain containing 1 | NA | NA |
| VLDLR-AS1 | 401491 | ENSG00000236404 | VLDLR antisense RNA 1 | NA | NA |
| DUSP4 | 1846 | ENSG00000120875 | dual specificity phosphatase 4 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, ERK2 and JNK, is expressed in a variety of tissues, and is localized in the nucleus. Two alternatively spliced transcript variants, encoding distinct isoforms, have been observed for this gene. In addition, multiple polyadenylation sites have been reported. | NA |
| KRT16 | 3868 | ENSG00000186832 | keratin 16 | The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region of chromosome 17q12-q21. This keratin has been coexpressed with keratin 14 in a number of epithelial tissues, including esophagus, tongue, and hair follicles. Mutations in this gene are associated with type 1 pachyonychia congenita, non-epidermolytic palmoplantar keratoderma and unilateral palmoplantar verrucous nevus. | NA |
| CEL | 1056 | ENSG00000170835 | carboxyl ester lipase | The protein encoded by this gene is a glycoprotein secreted from the pancreas into the digestive tract and from the lactating mammary gland into human milk. The physiological role of this protein is in cholesterol and lipid-soluble vitamin ester hydrolysis and absorption. This encoded protein promotes large chylomicron production in the intestine. Also its presence in plasma suggests its interactions with cholesterol and oxidized lipoproteins to modulate the progression of atherosclerosis. In pancreatic tumoral cells, this encoded protein is thought to be sequestrated within the Golgi compartment and is probably not secreted. This gene contains a variable number of tandem repeat (VNTR) polymorphism in the coding region that may influence the function of the encoded protein. | NA |
| SBSN | 374897 | ENSG00000189001 | suprabasin | NA | NA |
| PDIA2 | 64714 | ENSG00000185615 | protein disulfide isomerase family A member 2 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | NA |
| GADD45A | 1647 | ENSG00000116717 | growth arrest and DNA damage inducible alpha | This gene is a member of a group of genes whose transcript levels are increased following stressful growth arrest conditions and treatment with DNA-damaging agents. The protein encoded by this gene responds to environmental stresses by mediating activation of the p38/JNK pathway via MTK1/MEKK4 kinase. The DNA damage-induced transcription of this gene is mediated by both p53-dependent and -independent mechanisms. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | NA |
| SPAG4 | 6676 | ENSG00000061656 | sperm associated antigen 4 | The mammalian sperm flagellum contains two cytoskeletal structures associated with the axoneme: the outer dense fibers surrounding the axoneme in the midpiece and principal piece and the fibrous sheath surrounding the outer dense fibers in the principal piece of the tail. Defects in these structures are associated with abnormal tail morphology, reduced sperm motility, and infertility. In the rat, the protein encoded by this gene associates with an outer dense fiber protein via a leucine zipper motif and localizes to the microtubules of the manchette and axoneme during sperm tail development. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| TG | 7038 | ENSG00000042832 | thyroglobulin | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. | NA |
| LIPF | 8513 | ENSG00000182333 | lipase F, gastric type | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| THBS4 | 7060 | ENSG00000113296 | thrombospondin 4 | The protein encoded by this gene belongs to the thrombospondin protein family. Thrombospondin family members are adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions. This protein forms a pentamer and can bind to heparin and calcium. It is involved in local signaling in the developing and adult nervous system, and it contributes to spinal sensitization and neuropathic pain states. This gene is activated during the stromal response to invasive breast cancer. It may also play a role in inflammatory responses in Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | NA |
| MMP11 | 4320 | ENSG00000099953 | matrix metallopeptidase 11 | Proteins of the matrix metalloproteinase (MMP) family are involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis. Most MMP’s are secreted as inactive proproteins which are activated when cleaved by extracellular proteinases. However, the enzyme encoded by this gene is activated intracellularly by furin within the constitutive secretory pathway. Also in contrast to other MMP’s, this enzyme cleaves alpha 1-proteinase inhibitor but weakly degrades structural proteins of the extracellular matrix. | NA |
| LGALS7B | 653499 | ENSG00000178934 | galectin 7B | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. Differential and in situ hybridization studies indicate that this lectin is specifically expressed in keratinocytes and found mainly in stratified squamous epithelium. A duplicate copy of this gene (GeneID:3963) is found adjacent to, but on the opposite strand on chromosome 19. | NA |
| NA | NA | ENSG00000165862 | NA | NA | TRUE |
| MYBPC2 | 4606 | ENSG00000086967 | myosin binding protein C, fast type | This gene encodes a member of the myosin-binding protein C family. This family includes the fast-, slow- and cardiac-type isoforms, each of which is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The protein encoded by this locus is referred to as the fast-type isoform. Mutations in the related but distinct genes encoding the slow-type and cardiac-type isoforms have been associated with distal arthrogryposis, type 1 and hypertrophic cardiomyopathy, respectively. | NA |
| SOWAHC | 65124 | ENSG00000198142 | sosondowah ankyrin repeat domain family member C | NA | NA |
| STAMBPL1 | 57559 | ENSG00000138134 | STAM binding protein like 1 | NA | NA |
| PGF | 5228 | ENSG00000119630 | placental growth factor | This gene encodes a growth factor found in placenta which is homologous to vascular endothelial growth factor. Alternatively spliced transcripts encoding different isoforms have been found for this gene. | NA |
| NEDD4 | 4734 | ENSG00000069869 | neural precursor cell expressed, developmentally down-regulated 4, E3 ubiquitin protein ligase | NA | NA |
| RP11-256I23.1 | ENSG00000268896 | ENSG00000268896 | NA | NA | NA |
| PRTFDC1 | 56952 | ENSG00000099256 | phosphoribosyl transferase domain containing 1 | NA | NA |
| PTPRN2 | 5799 | ENSG00000155093 | protein tyrosine phosphatase, receptor type N2 | This gene encodes a protein with sequence similarity to receptor-like protein tyrosine phosphatases. However, tyrosine phosphatase activity has not been experimentally validated for this protein. Studies of the rat ortholog suggest that the encoded protein may instead function as a phosphatidylinositol phosphatase with the ability to dephosphorylate phosphatidylinositol 3-phosphate and phosphatidylinositol 4,5-diphosphate, and this function may be involved in the regulation of insulin secretion. This protein has been identified as an autoantigen in insulin-dependent diabetes mellitus. Alternative splicing results in multiple transcript variants. | NA |
| TCAP | 8557 | ENSG00000173991 | titin-cap | Sarcomere assembly is regulated by the muscle protein titin. Titin is a giant elastic protein with kinase activity that extends half the length of a sarcomere. It serves as a scaffold to which myofibrils and other muscle related proteins are attached. This gene encodes a protein found in striated and cardiac muscle that binds to the titin Z1-Z2 domains and is a substrate of titin kinase, interactions thought to be critical to sarcomere assembly. Mutations in this gene are associated with limb-girdle muscular dystrophy type 2G. | NA |
| FITM1 | 161247 | ENSG00000139914 | fat storage inducing transmembrane protein 1 | FIT1 belongs to an evolutionarily conserved family of proteins involved in fat storage (Kadereit et al., 2008 [PubMed 18160536]). | NA |
| RAB6B | 51560 | ENSG00000154917 | RAB6B, member RAS oncogene family | NA | NA |
| GRAMD1B | 57476 | ENSG00000023171 | GRAM domain containing 1B | NA | NA |
# Save the Ensembl IDs queried for cluster 8, one per line.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",8,".txt"),
            col.names = FALSE, row.names=FALSE, quote=FALSE);
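The `notfound` column in the table above flags queries that mygene could not resolve (for example ENSG00000165862). The following is a minimal sketch, not part of the original analysis, of how such IDs could be separated out before the gene list is written. The data frame `out_df` is a toy stand-in for `as.data.frame(out)`, and the names `is_missing`, `resolved_ids`, `unresolved_ids` and `out_path` are hypothetical.

```r
# Toy stand-in for as.data.frame(out); the three rows reuse IDs from the table above.
out_df <- data.frame(
  query    = c("ENSG00000131771", "ENSG00000165862", "ENSG00000175315"),
  symbol   = c("PPP1R1B", NA, "CST6"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Matched rows carry notfound == NA; unmatched rows carry notfound == TRUE.
is_missing     <- !is.na(out_df$notfound) & out_df$notfound
resolved_ids   <- out_df$query[!is_missing]
unresolved_ids <- out_df$query[is_missing]

# Write only the resolved IDs; tempfile() is a placeholder for the real output path.
out_path <- tempfile(fileext = ".txt")
write.table(resolved_ids, out_path, col.names = FALSE, row.names = FALSE, quote = FALSE)
```

Note that the `notfound` column seems to be present only when at least one query failed (the first table in this report has no such column), so real code would need to check that it exists before filtering on it.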
out <- mygene::queryMany(gene_list[9,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | X_id | summary | symbol | query | notfound |
|---|---|---|---|---|---|
| chromogranin A | 1113 | The protein encoded by this gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins. It is found in secretory vesicles of neurons and endocrine cells. This gene product is a precursor to three biologically active peptides; vasostatin, pancreastatin, and parastatin. These peptides act as autocrine or paracrine negative modulators of the neuroendocrine system. Two other peptides, catestatin and chromofungin, have antimicrobial activity and antifungal activity, respectively. Two transcript variants encoding different isoforms have been found for this gene. | CHGA | ENSG00000100604 | NA |
| protease, serine 3 | 5646 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is expressed in the brain and pancreas and is resistant to common trypsin inhibitors. It is active on peptide linkages involving the carboxyl group of lysine or arginine. This gene is localized to the locus of T cell receptor beta variable orphans on chromosome 9. Four transcript variants encoding different isoforms have been described for this gene. | PRSS3 | ENSG00000010438 | NA |
| kinesin family member 5A | 3798 | This gene encodes a member of the kinesin family of proteins. Members of this family are part of a multisubunit complex that functions as a microtubule motor in intracellular organelle transport. Mutations in this gene cause autosomal dominant spastic paraplegia 10. | KIF5A | ENSG00000155980 | NA |
| carboxypeptidase A2 | 1358 | Three different forms of human pancreatic procarboxypeptidase A have been isolated. The encoded protein represents the A2 form, which is a monomeric protein with different biochemical properties from the A1 and A3 forms. The A2 form of pancreatic procarboxypeptidase acts on aromatic C-terminal residues and is a secreted protein. | CPA2 | ENSG00000158516 | NA |
| epithelial cell adhesion molecule | 4072 | This gene encodes a carcinoma-associated antigen and is a member of a family that includes at least two type I membrane proteins. This antigen is expressed on most normal epithelial cells and gastrointestinal carcinomas and functions as a homotypic calcium-independent cell adhesion molecule. The antigen is being used as a target for immunotherapy treatment of human carcinomas. Mutations in this gene result in congenital tufting enteropathy. | EPCAM | ENSG00000119888 | NA |
| ST3 beta-galactoside alpha-2,3-sialyltransferase 6 | 10402 | The protein encoded by this gene is a member of the sialyltransferase family. Members of this family are enzymes that transfer sialic acid from the activated cytidine 5’-monophospho-N-acetylneuraminic acid to terminal positions on sialylated glycolipids (gangliosides) or to the N- or O-linked sugar chains of glycoproteins. This protein has high specificity for neolactotetraosylceramide and neolactohexaosylceramide as glycolipid substrates and may contribute to the formation of selectin ligands and sialyl Lewis X, a carbohydrate important for cell-to-cell recognition and a blood group antigen. | ST3GAL6 | ENSG00000064225 | NA |
| ATPase Na+/K+ transporting subunit beta 1 | 481 | The protein encoded by this gene belongs to the family of Na+/K+ and H+/K+ ATPases beta chain proteins, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The beta subunit regulates, through assembly of alpha/beta heterodimers, the number of sodium pumps transported to the plasma membrane. The glycoprotein subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes a beta 1 subunit. Alternatively spliced transcript variants encoding different isoforms have been described, but their biological validity is not known. | ATP1B1 | ENSG00000143153 | NA |
| procollagen C-endopeptidase enhancer 2 | 26577 | NA | PCOLCE2 | ENSG00000163710 | NA |
| hes family bHLH transcription factor 6 | 55502 | This gene encodes a member of a subfamily of basic helix-loop-helix transcription repressors that have homology to the Drosophila enhancer of split genes. Members of this gene family regulate cell differentiation in numerous cell types. The protein encoded by this gene functions as a cofactor, interacting with other transcription factors through a tetrapeptide domain in its C-terminus. Alternatively spliced transcript variants encoding different isoforms have been described. | HES6 | ENSG00000144485 | NA |
| syntaphilin | 9751 | Syntaxin-1, synaptobrevin/VAMP, and SNAP25 interact to form the SNARE complex, which is required for synaptic vesicle docking and fusion. The protein encoded by this gene is membrane-associated and inhibits SNARE complex formation by binding free syntaxin-1. Expression of this gene appears to be brain-specific. Alternative splicing results in multiple transcript variants encoding different isoforms. | SNPH | ENSG00000101298 | NA |
| neuralized E3 ubiquitin protein ligase 1 | 9148 | NA | NEURL1 | ENSG00000107954 | NA |
| aquaporin 9 | 366 | The aquaporins are a family of water-selective membrane channels. This gene encodes a member of a subset of aquaporins called the aquaglyceroporins. This protein allows passage of a broad range of noncharged solutes and also stimulates urea transport and osmotic water permeability. This protein may also facilitate the uptake of glycerol in hepatic tissue. The encoded protein may also play a role in specialized leukocyte functions such as immunological response and bactericidal activity. Alternate splicing results in multiple transcript variants. | AQP9 | ENSG00000103569 | NA |
| prune homolog 2 | 158471 | The protein encoded by this gene belongs to the B-cell CLL/lymphoma 2 and adenovirus E1B 19 kDa interacting family, whose members play roles in many cellular processes including apoptosis, cell transformation, and synaptic function. Several functions for this protein have been demonstrated including suppression of Ras homolog family member A activity, which results in reduced stress fiber formation and suppression of oncogenic cellular transformation. A high molecular weight isoform of this protein has also been shown to colocalize with Adaptor protein complex 2, beta-Adaptin and endodermal markers, suggesting an involvement in post-endocytic trafficking. In prostate cancer cells, this gene acts as a tumor suppressor and its expression is regulated by prostate cancer antigen 3, a non-protein coding gene on the opposite DNA strand in an intron of this gene. Prostate cancer antigen 3 regulates levels of this gene through formation of a double-stranded RNA that undergoes adenosine deaminase acting on RNA-dependent adenosine-to-inosine RNA editing. Alternative splicing results in multiple transcript variants. | PRUNE2 | ENSG00000106772 | NA |
| progastricsin | 5225 | This gene encodes an aspartic proteinase that belongs to the peptidase family A1. The encoded protein is a digestive enzyme that is produced in the stomach and constitutes a major component of the gastric mucosa. This protein is also secreted into the serum. This protein is synthesized as an inactive zymogen that includes a highly basic prosegment. This enzyme is converted into its active mature form at low pH by sequential cleavage of the prosegment that is carried out by the enzyme itself. Polymorphisms in this gene are associated with susceptibility to gastric cancers. Serum levels of this enzyme are used as a biomarker for certain gastric diseases including Helicobacter pylori related gastritis. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 1. | PGC | ENSG00000096088 | NA |
| phosphorylase kinase, alpha 1 pseudogene 1 | ENSG00000232882 | NA | PHKA1P1 | ENSG00000232882 | NA |
| lin-7 homolog A, crumbs cell polarity complex component | 8825 | The protein encoded by this gene is involved in generating and maintaining the asymmetric distribution of channels and receptors at the cell membrane. The encoded protein also is required for the localization of some specific channels and can be part of a protein complex that couples synaptic vesicle exocytosis to cell adhesion in the brain. | LIN7A | ENSG00000111052 | NA |
| PILR alpha associated neural protein | 196500 | This gene encodes a ligand for the paired immunoglobin-like type 2 receptor alpha, and so may be involved in immune regulation. Alternate splicing results in multiple transcript variants encoding different proteins. | PIANP | ENSG00000139200 | NA |
| immunoglobulin heavy constant alpha 1 | ENSG00000211895 | NA | IGHA1 | ENSG00000211895 | NA |
| glutathione peroxidase 2 | 2877 | This gene is a member of the glutathione peroxidase family and encodes a selenium-dependent glutathione peroxidase that is one of two isoenzymes responsible for the majority of the glutathione-dependent hydrogen peroxide-reducing activity in the epithelium of the gastrointestinal tract. The protein encoded by this locus contains a selenocysteine (Sec) residue encoded by the UGA codon, which normally signals translation termination. Alternatively spliced transcript variants have been described. | GPX2 | ENSG00000176153 | NA |
| immunoglobulin heavy constant alpha 2 (A2m marker) | ENSG00000211890 | NA | IGHA2 | ENSG00000211890 | NA |
| potassium calcium-activated channel subfamily M alpha 1 | 3778 | MaxiK channels are large conductance, voltage and calcium-sensitive potassium channels which are fundamental to the control of smooth muscle tone and neuronal excitability. MaxiK channels can be formed by 2 subunits: the pore-forming alpha subunit, which is the product of this gene, and the modulatory beta subunit. Intracellular calcium regulates the physical association between the alpha and beta subunits. Alternatively spliced transcript variants encoding different isoforms have been identified. | KCNMA1 | ENSG00000156113 | NA |
| NA | NA | NA | NA | ENSG00000156750 | TRUE |
| prostate stem cell antigen | 8000 | This gene encodes a glycosylphosphatidylinositol-anchored cell membrane glycoprotein. In addition to being highly expressed in the prostate it is also expressed in the bladder, placenta, colon, kidney, and stomach. This gene is up-regulated in a large proportion of prostate cancers and is also detected in cancers of the bladder and pancreas. This gene includes a polymorphism that results in an upstream start codon in some individuals; this polymorphism is thought to be associated with a risk for certain gastric and bladder cancers. Alternative splicing results in multiple transcript variants. | PSCA | ENSG00000167653 | NA |
| myosin light chain 2 | 4633 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca2+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. | MYL2 | ENSG00000111245 | NA |
| transmembrane protein 158 (gene/pseudogene) | 25907 | Constitutive activation of the Ras pathway triggers an irreversible proliferation arrest reminiscent of replicative senescence. Transcription of this gene is upregulated in response to activation of the Ras pathway, but not under other conditions that induce senescence. The encoded protein is similar to a rat cell surface receptor proposed to function in a neuronal survival pathway. An allelic polymorphism in this gene results in both functional and non-functional (frameshifted) alleles; the reference genome represents the functional allele. | TMEM158 | ENSG00000249992 | NA |
| NA | ENSG00000261534 | NA | RP11-244O19.1 | ENSG00000261534 | NA |
| regulator of G-protein signaling 9 | 8787 | This gene encodes a member of the RGS family of GTPase activating proteins that function in various signaling pathways by accelerating the deactivation of G proteins. This protein is anchored to photoreceptor membranes in retinal cells and deactivates G proteins in the rod and cone phototransduction cascades. Mutations in this gene result in bradyopsia. Multiple transcript variants encoding different isoforms have been found for this gene. | RGS9 | ENSG00000108370 | NA |
| PITPNM family member 3 | 83394 | This gene encodes a member of a family of membrane-associated phosphatidylinositol transfer domain-containing proteins. The calcium-binding protein has phosphatidylinositol (PI) transfer activity and interacts with the protein tyrosine kinase PTK2B (also known as PYK2). The protein is homologous to a Drosophila protein that is implicated in the visual transduction pathway in flies. Mutations in this gene result in autosomal dominant cone dystrophy. Multiple transcript variants encoding different isoforms have been found for this gene. | PITPNM3 | ENSG00000091622 | NA |
| immunoglobulin lambda like polypeptide 5 | 100423062 | This gene encodes one of the immunoglobulin lambda-like polypeptides. It is located within the immunoglobulin lambda locus but it does not require somatic rearrangement for expression. The first exon of this gene is unrelated to immunoglobulin variable genes; the second and third exons are the immunoglobulin lambda joining 1 and the immunoglobulin lambda constant 1 gene segments. Alternative splicing results in multiple transcript variants. | IGLL5 | ENSG00000254709 | NA |
| prominin 2 | 150696 | This gene encodes a member of the prominin family of pentaspan membrane glycoproteins. The encoded protein localizes to basal epithelial cells and may be involved in the organization of plasma membrane microdomains. Alternative splicing results in multiple transcript variants. | PROM2 | ENSG00000155066 | NA |
| stratifin | 2810 | NA | SFN | ENSG00000175793 | NA |
| transmembrane protein 59 like | 25789 | This gene encodes a predicted type-I membrane glycoprotein. The encoded protein may play a role in functioning of the central nervous system. | TMEM59L | ENSG00000105696 | NA |
| polymeric immunoglobulin receptor | 5284 | This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | PIGR | ENSG00000162896 | NA |
| myosin light chain 3 | 4634 | MYL3 encodes myosin light chain 3, an alkali light chain also referred to in the literature as both the ventricular isoform and the slow skeletal muscle isoform. Mutations in MYL3 have been identified as a cause of mid-left ventricular chamber type hypertrophic cardiomyopathy. | MYL3 | ENSG00000160808 | NA |
| fucosyltransferase 2 | 2524 | The protein encoded by this gene is a Golgi stack membrane protein that is involved in the creation of a precursor of the H antigen, which is required for the final step in the soluble A and B antigen synthesis pathway. This gene is one of two encoding the galactoside 2-L-fucosyltransferase enzyme. Two transcript variants encoding the same protein have been found for this gene. | FUT2 | ENSG00000176920 | NA |
| mucin 1, cell surface associated | 4582 | This gene encodes a membrane-bound protein that is a member of the mucin family. Mucins are O-glycosylated proteins that play an essential role in forming protective mucous barriers on epithelial surfaces. These proteins also play a role in intracellular signaling. This protein is expressed on the apical surface of epithelial cells that line the mucosal surfaces of many different tissues including lung, breast, stomach and pancreas. This protein is proteolytically cleaved into alpha and beta subunits that form a heterodimeric complex. The N-terminal alpha subunit functions in cell-adhesion and the C-terminal beta subunit is involved in cell signaling. Overexpression, aberrant intracellular localization, and changes in glycosylation of this protein have been associated with carcinomas. This gene is known to contain a highly polymorphic variable number tandem repeats (VNTR) domain. Alternate splicing results in multiple transcript variants. | MUC1 | ENSG00000185499 | NA |
| leucine rich repeat containing 4B | 94030 | NA | LRRC4B | ENSG00000131409 | NA |
| neogenin 1 | 4756 | This gene encodes a cell surface protein that is a member of the immunoglobulin superfamily. The encoded protein consists of four N-terminal immunoglobulin-like domains, six fibronectin type III domains, a transmembrane domain and a C-terminal internal domain that shares homology with the tumor suppressor candidate gene DCC. This protein may be involved in cell growth and differentiation and in cell-cell adhesion. Defects in this gene are associated with cell proliferation in certain cancers. Alternate splicing results in multiple transcript variants. | NEO1 | ENSG00000067141 | NA |
| RAB25, member RAS oncogene family | 57111 | The protein encoded by this gene is a member of the RAS superfamily of small GTPases. The encoded protein is involved in membrane trafficking and cell survival. This gene has been found to be a tumor suppressor and an oncogene, depending on the context. Two variants, one protein-coding and the other not, have been found for this gene. | RAB25 | ENSG00000132698 | NA |
| lipase F, gastric type | 8513 | This gene encodes gastric lipase, an enzyme involved in the digestion of dietary triglycerides in the gastrointestinal tract, and responsible for 30% of fat digestion processes occurring in human. It is secreted by gastric chief cells in the fundic mucosa of the stomach, and it hydrolyzes the ester bonds of triglycerides under acidic pH conditions. The gene is a member of a conserved gene family of lipases that play distinct roles in neutral lipid metabolism. Several transcript variants encoding different isoforms have been found for this gene. | LIPF | ENSG00000182333 | NA |
| atlastin GTPase 1 | 51062 | The protein encoded by this gene is a GTPase and a Golgi body transmembrane protein. The encoded protein can form a homotetramer and has been shown to interact with spastin and with mitogen-activated protein kinase kinase kinase kinase 4. This protein may be involved in axonal maintenance as evidenced by the fact that defects in this gene are a cause of spastic paraplegia type 3. Three transcript variants encoding two different isoforms have been found for this gene. | ATL1 | ENSG00000198513 | NA |
| C-terminal binding protein 2 | 1488 | This gene produces alternative transcripts encoding two distinct proteins. One protein is a transcriptional repressor, while the other isoform is a major component of specialized synapses known as synaptic ribbons. Both proteins contain a NAD+ binding domain similar to NAD+-dependent 2-hydroxyacid dehydrogenases. A portion of the 3’ untranslated region was used to map this gene to chromosome 21q21.3; however, it was noted that similar loci elsewhere in the genome are likely. Blast analysis shows that this gene is present on chromosome 10. Several transcript variants encoding two different isoforms have been found for this gene. | CTBP2 | ENSG00000175029 | NA |
| acyl-CoA synthetase long-chain family member 1 | 2180 | The protein encoded by this gene is an isozyme of the long-chain fatty-acid-coenzyme A ligase family. Although differing in substrate specificity, subcellular localization, and tissue distribution, all isozymes of this family convert free long-chain fatty acids into fatty acyl-CoA esters, and thereby play a key role in lipid biosynthesis and fatty acid degradation. Several transcript variants encoding different isoforms have been found for this gene. | ACSL1 | ENSG00000151726 | NA |
| NA | ENSG00000261054 | NA | RP11-6O2.4 | ENSG00000261054 | NA |
| carbohydrate (N-acetylgalactosamine 4-sulfate 6-O) sulfotransferase 15 | 51363 | Chondroitin sulfate (CS) is a glycosaminoglycan which is an important structural component of the extracellular matrix and which links to proteins to form proteoglycans. Chondroitin sulfate E (CS-E) is an isomer of chondroitin sulfate in which the C-4 and C-6 hydroxyl groups are sulfated. This gene encodes a type II transmembrane glycoprotein that acts as a sulfotransferase to transfer sulfate to the C-6 hydroxyl group of chondroitin sulfate. This gene has also been identified as being co-expressed with RAG1 in B-cells and as potentially acting as a B-cell surface signaling receptor. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | CHST15 | ENSG00000182022 | NA |
| syntaxin 11 | 8676 | This gene encodes a member of the syntaxin family. Syntaxins have been implicated in the targeting and fusion of intracellular transport vesicles. This family member may regulate protein transport among late endosomes and the trans-Golgi network. Mutations in this gene have been associated with familial hemophagocytic lymphohistiocytosis. | STX11 | ENSG00000135604 | NA |
| natriuretic peptide A | 4878 | The protein encoded by this gene belongs to the natriuretic peptide family. Natriuretic peptides are implicated in the control of extracellular fluid volume and electrolyte homeostasis. This protein is synthesized as a large precursor (containing a signal peptide), which is processed to release a peptide from the N-terminus with similarity to vasoactive peptide, cardiodilatin, and another peptide from the C-terminus with natriuretic-diuretic activity. Mutations in this gene have been associated with atrial fibrillation familial type 6. This gene is located adjacent to another member of the natriuretic family of peptides on chromosome 1. | NPPA | ENSG00000175206 | NA |
| PDZ domain containing ring finger 3 | 23024 | This gene encodes a member of the LNX (Ligand of Numb Protein-X) family of RING-type ubiquitin E3 ligases. This protein may function in vascular morphogenesis and the differentiation of adipocytes, osteoblasts and myoblasts. This protein may be targeted for degradation by the human papilloma virus E6 protein. Alternative splicing results in multiple transcript variants. | PDZRN3 | ENSG00000121440 | NA |
| pancreatic lipase related protein 1 | 5407 | NA | PNLIPRP1 | ENSG00000187021 | NA |
| polypeptide N-acetylgalactosaminyltransferase 12 | 79695 | This gene encodes a member of a family of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases, which catalyze the transfer of N-acetylgalactosamine (GalNAc) from UDP-GalNAc to a serine or threonine residue on a polypeptide acceptor in the initial step of O-linked protein glycosylation. Mutations in this gene are associated with an increased susceptibility to colorectal cancer. | GALNT12 | ENSG00000119514 | NA |
| SCOC antisense RNA 1 | 100129858 | NA | SCOC-AS1 | ENSG00000196951 | NA |
| nucleoredoxin | 64359 | This gene encodes a member of the thioredoxin superfamily, a group of small, multifunctional redox-active proteins. Members of this family are characterized by a conserved active motif called the thioredoxin fold that catalyzes disulfide bond formation and isomerization. The encoded protein acts as a redox-dependent regulator of the Wnt signaling pathway and is involved in cell growth and differentiation. | NXN | ENSG00000167693 | NA |
| apolipoprotein C1 | 341 | This gene encodes a member of the apolipoprotein C1 family. This gene is expressed primarily in the liver, and it is activated when monocytes differentiate into macrophages. The encoded protein plays a central role in high density lipoprotein (HDL) and very low density lipoprotein (VLDL) metabolism. This protein has also been shown to inhibit cholesteryl ester transfer protein in plasma. A pseudogene of this gene is located 4 kb downstream in the same orientation, on the same chromosome. This gene is mapped to chromosome 19, where it resides within an apolipoprotein gene cluster. | APOC1 | ENSG00000130208 | NA |
| importin 7 pseudogene 2 | ENSG00000225674 | NA | IPO7P2 | ENSG00000225674 | NA |
| immunoglobulin lambda constant 1 (Mcg marker) | ENSG00000211675 | NA | IGLC1 | ENSG00000211675 | NA |
| cell death inducing DFFA like effector c | 63924 | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | CIDEC | ENSG00000187288 | NA |
| lipase E, hormone sensitive type | 3991 | The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | LIPE | ENSG00000079435 | NA |
| olfactomedin 4 | 10562 | This gene was originally cloned from human myeloblasts and found to be selectively expressed in inflamed colonic epithelium. This gene encodes a member of the olfactomedin family. The encoded protein is an antiapoptotic factor that promotes tumor growth and is an extracellular matrix glycoprotein that facilitates cell adhesion. | OLFM4 | ENSG00000102837 | NA |
| cytochrome P450 family 2 subfamily J member 2 | 1573 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and is thought to be the predominant enzyme responsible for epoxidation of endogenous arachidonic acid in cardiac tissue. Multiple transcript variants have been found for this gene. | CYP2J2 | ENSG00000134716 | NA |
| membrane palmitoylated protein 7 | 143098 | The protein encoded by this gene is a member of the p55 Stardust family of membrane-associated guanylate kinase (MAGUK) proteins, which function in the establishment of epithelial cell polarity. This family member forms a complex with the polarity protein DLG1 (discs, large homolog 1) and facilitates epithelial cell polarity and tight junction formation. Polymorphisms in this gene are associated with variations in site-specific bone mineral density (BMD). Alternative splicing results in multiple transcript variants. | MPP7 | ENSG00000150054 | NA |
| NA | ENSG00000254680 | NA | RP11-265D17.2 | ENSG00000254680 | NA |
| plastin 1 | 5357 | Plastins are a family of actin-binding proteins that are conserved throughout eukaryote evolution and expressed in most tissues of higher eukaryotes. In humans, two ubiquitous plastin isoforms (L and T) have been identified. The protein encoded by this gene is a third distinct plastin isoform, which is specifically expressed at high levels in the small intestine. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. A pseudogene of this gene is found on chromosome 11. | PLS1 | ENSG00000120756 | NA |
| NA | NA | NA | NA | ENSG00000250606 | TRUE |
| ras homolog family member U | 58480 | This gene encodes a member of the Rho family of GTPases. This protein can activate PAK1 and JNK1, and can induce filopodium formation and stress fiber dissolution. It may also mediate the effects of WNT1 signaling in the regulation of cell morphology, cytoskeletal organization, and cell proliferation. A non-coding transcript variant of this gene results from naturally occurring read-through transcription between this locus and the neighboring DUSP5P (dual specificity phosphatase 5 pseudogene) locus. | RHOU | ENSG00000116574 | NA |
| C1q and tumor necrosis factor related protein 3 | 114899 | NA | C1QTNF3 | ENSG00000082196 | NA |
| solute carrier family 22 member 17 | 51310 | NA | SLC22A17 | ENSG00000092096 | NA |
| KIAA1522 | 57648 | NA | KIAA1522 | ENSG00000162522 | NA |
| NA | ENSG00000263065 | NA | AF001548.6 | ENSG00000263065 | NA |
| NA | ENSG00000261240 | NA | RP11-304L19.4 | ENSG00000261240 | NA |
| transmembrane protein 54 | 113452 | NA | TMEM54 | ENSG00000121900 | NA |
| sine oculis binding protein homolog | 55084 | The protein encoded by this gene is a nuclear zinc finger protein that is involved in development of the cochlea. Defects in this gene have also been linked to intellectual disability. | SOBP | ENSG00000112320 | NA |
| sphingosine-1-phosphate receptor 1 | 1901 | The protein encoded by this gene is structurally similar to G protein-coupled receptors and is highly expressed in endothelial cells. It binds the ligand sphingosine-1-phosphate with high affinity and high specificity, and is suggested to be involved in the processes that regulate the differentiation of endothelial cells. Activation of this receptor induces cell-cell adhesion. Alternative splicing results in multiple transcript variants. | S1PR1 | ENSG00000170989 | NA |
| S100 calcium binding protein B | 6285 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21; however, this gene is located at 21q22.3. This protein may function in Neurite extension, proliferation of melanoma cells, stimulation of Ca2+ fluxes, inhibition of PKC-mediated phosphorylation, astrocytosis and axonal proliferation, and inhibition of microtubule assembly. Chromosomal rearrangements and altered expression of this gene have been implicated in several neurological, neoplastic, and other types of diseases, including Alzheimer’s disease, Down’s syndrome, epilepsy, amyotrophic lateral sclerosis, melanoma, and type I diabetes. | S100B | ENSG00000160307 | NA |
| zinc finger protein 853 | 54753 | NA | ZNF853 | ENSG00000236609 | NA |
| proline rich transmembrane protein 2 | 112476 | This gene encodes a transmembrane protein containing a proline-rich domain in its N-terminal half. Studies in mice suggest that it is predominantly expressed in brain and spinal cord in embryonic and postnatal stages. Mutations in this gene are associated with episodic kinesigenic dyskinesia-1. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | PRRT2 | ENSG00000167371 | NA |
| NA | NA | NA | NA | ENSG00000225490 | TRUE |
| discoidin domain receptor tyrosine kinase 1 | 780 | Receptor tyrosine kinases play a key role in the communication of cells with their microenvironment. These kinases are involved in the regulation of cell growth, differentiation and metabolism. The protein encoded by this gene belongs to a subfamily of tyrosine kinase receptors with homology to Dictyostelium discoideum protein discoidin I in their extracellular domain, and that are activated by various types of collagen. Expression of this protein is restricted to epithelial cells, particularly in the kidney, lung, gastrointestinal tract, and brain. In addition, it has been shown to be significantly overexpressed in several human tumors. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | DDR1 | ENSG00000204580 | NA |
| nudix hydrolase 8 | 254552 | NA | NUDT8 | ENSG00000167799 | NA |
| NA | ENSG00000223774 | NA | RP11-307B6.3 | ENSG00000223774 | NA |
| BR serine/threonine kinase 1 | 84446 | NA | BRSK1 | ENSG00000160469 | NA |
| dysbindin (dystrobrevin binding protein 1) domain containing 1 | 79007 | NA | DBNDD1 | ENSG00000003249 | NA |
| retinol binding protein 1 | 5947 | This gene encodes the carrier protein involved in the transport of retinol (vitamin A alcohol) from the liver storage site to peripheral tissue. Vitamin A is a fat-soluble vitamin necessary for growth, reproduction, differentiation of epithelial tissues, and vision. Multiple transcript variants encoding different isoforms have been found for this gene. | RBP1 | ENSG00000114115 | NA |
| protease, serine 8 | 5652 | This gene encodes a member of the peptidase S1 or chymotrypsin family of serine proteases. The encoded preproprotein is proteolytically processed to generate light and heavy chains that associate via a disulfide bond to form the heterodimeric enzyme. This enzyme is highly expressed in prostate epithelia and is one of several proteolytic enzymes found in seminal fluid. This protease exhibits trypsin-like substrate specificity, cleaving protein substrates at the carboxyl terminus of lysine or arginine residues. The encoded protease partially mediates proteolytic activation of the epithelial sodium channel, a regulator of sodium balance, and may also play a role in epithelial barrier formation. | PRSS8 | ENSG00000052344 | NA |
| ectonucleotide pyrophosphatase/phosphodiesterase 5 (putative) | 59084 | This gene encodes a type-I transmembrane glycoprotein. Studies in rat suggest the encoded protein may play a role in neuronal cell communications. Alternatively spliced transcript variants have been described. | ENPP5 | ENSG00000112796 | NA |
| acyl-CoA thioesterase 11 | 26027 | This gene encodes a member of the acyl-CoA thioesterase family which catalyse the conversion of activated fatty acids to the corresponding non-esterified fatty acid and coenzyme A. Expression of a mouse homolog in brown adipose tissue is induced by low temperatures and repressed by warm temperatures. Higher levels of expression of the mouse homolog have been found in obesity-resistant mice compared with obesity-prone mice, suggesting a role of acyl-CoA thioesterase 11 in obesity. Alternative splicing results in transcript variants. | ACOT11 | ENSG00000162390 | NA |
| dishevelled segment polarity protein 1 | 1855 | DVL1, the human homolog of the Drosophila dishevelled gene (dsh) encodes a cytoplasmic phosphoprotein that regulates cell proliferation, acting as a transducer molecule for developmental processes, including segmentation and neuroblast specification. DVL1 is a candidate gene for neuroblastomatous transformation. The Schwartz-Jampel syndrome and Charcot-Marie-Tooth disease type 2A have been mapped to the same region as DVL1. The phenotypes of these diseases may be consistent with defects which might be expected from aberrant expression of a DVL gene during development. | DVL1 | ENSG00000107404 | NA |
| intermediate filament family orphan 2 | 126917 | NA | IFFO2 | ENSG00000169991 | NA |
| ATPase phospholipid transporting 9A (putative) | 10079 | NA | ATP9A | ENSG00000054793 | NA |
| myozenin 2 | 51778 | The protein encoded by this gene belongs to a family of sarcomeric proteins that bind to calcineurin, a phosphatase involved in calcium-dependent signal transduction in diverse cell types. These family members tether calcineurin to alpha-actinin at the z-line of the sarcomere of cardiac and skeletal muscle cells, and thus they are important for calcineurin signaling. Mutations in this gene cause cardiomyopathy familial hypertrophic type 16, a hereditary heart disorder. | MYOZ2 | ENSG00000172399 | NA |
| G protein subunit alpha z | 2781 | The protein encoded by this gene is a member of a G protein subfamily that mediates signal transduction in pertussis toxin-insensitive systems. This encoded protein may play a role in maintaining the ionic balance of perilymphatic and endolymphatic cochlear fluids. | GNAZ | ENSG00000128266 | NA |
| beta-2-microglobulin | 567 | This gene encodes a serum protein found in association with the major histocompatibility complex (MHC) class I heavy chain on the surface of nearly all nucleated cells. The protein has a predominantly beta-pleated sheet structure that can form amyloid fibrils in some pathological conditions. The encoded antimicrobial protein displays antibacterial activity in amniotic fluid. A mutation in this gene has been shown to result in hypercatabolic hypoproteinemia. | B2M | ENSG00000166710 | NA |
| NA | ENSG00000247134 | NA | RP11-11N9.4 | ENSG00000247134 | NA |
| cullin associated and neddylation dissociated 2 (putative) | 23066 | NA | CAND2 | ENSG00000144712 | NA |
| inositol-trisphosphate 3-kinase A | 3706 | Regulates inositol phosphate metabolism by phosphorylation of second messenger inositol 1,4,5-trisphosphate to Ins(1,3,4,5)P4. The activity of the inositol 1,4,5-trisphosphate 3-kinase is responsible for regulating the levels of a large number of inositol polyphosphates that are important in cellular signaling. Both calcium/calmodulin and protein phosphorylation mechanisms control its activity. It is also a substrate for the cyclic AMP-dependent protein kinase, calcium/calmodulin- dependent protein kinase II, and protein kinase C in vitro. | ITPKA | ENSG00000137825 | NA |
| carbonic anhydrase 9 | 768 | Carbonic anhydrases (CAs) are a large family of zinc metalloenzymes that catalyze the reversible hydration of carbon dioxide. They participate in a variety of biological processes, including respiration, calcification, acid-base balance, bone resorption, and the formation of aqueous humor, cerebrospinal fluid, saliva, and gastric acid. They show extensive diversity in tissue distribution and in their subcellular localization. CA IX is a transmembrane protein and is one of only two tumor-associated carbonic anhydrase isoenzymes known. It is expressed in all clear-cell renal cell carcinoma, but is not detected in normal kidney or most other normal tissues. It may be involved in cell proliferation and transformation. This gene was mapped to 17q21.2 by fluorescence in situ hybridization, however, radiation hybrid mapping localized it to 9p13-p12. | CA9 | ENSG00000107159 | NA |
| NA | ENSG00000229212 | NA | RP11-561C5.4 | ENSG00000229212 | NA |
| NA | ENSG00000259684 | NA | RP11-120K9.2 | ENSG00000259684 | NA |
| microtubule associated monooxygenase, calponin and LIM domain containing 2 | 9645 | NA | MICAL2 | ENSG00000133816 | NA |
| p21 (RAC1) activated kinase 1 | 5058 | This gene encodes a family member of serine/threonine p21-activating kinases, known as PAK proteins. These proteins are critical effectors that link RhoGTPases to cytoskeleton reorganization and nuclear signaling, and they serve as targets for the small GTP binding proteins Cdc42 and Rac. This specific family member regulates cell motility and morphology. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | PAK1 | ENSG00000149269 | NA |
| naked cuticle homolog 1 | 85407 | In the mouse, Nkd is a Dishevelled (see DVL1; MIM 601365)-binding protein that functions as a negative regulator of the Wnt (see WNT1; MIM 164820)-beta-catenin (see MIM 116806)-Tcf (see MIM 602272) signaling pathway. | NKD1 | ENSG00000140807 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 9, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
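The query-and-write steps are repeated by hand for each cluster, and the column order of the resulting tables varies from one query to the next. As a minimal sketch (not part of the original analysis), the two steps could be wrapped in one helper. The name `annotate_cluster`, its arguments, and the preferred column order are assumptions made for illustration; the default query simply mirrors the `mygene::queryMany` call used in this report.

```r
# Hypothetical helper (not in the original analysis): annotate one cluster
# of Ensembl IDs and write the queried IDs to a per-cluster text file.
annotate_cluster <- function(gene_list, k, out_dir, query_fun = NULL) {
  if (is.null(query_fun)) {
    # Default mirrors the call used in this report; needs network access.
    query_fun <- function(ids) {
      mygene::queryMany(ids, scopes = "ensembl.gene",
                        fields = c("name", "summary", "symbol"),
                        species = "human")
    }
  }
  out <- as.data.frame(query_fun(gene_list[k, ]))

  # Put the columns in a fixed order when they are present (assumed names,
  # taken from the table headers shown in this report).
  preferred <- c("query", "symbol", "name", "X_id", "summary", "notfound")
  cols <- c(intersect(preferred, names(out)), setdiff(names(out), preferred))
  out <- out[, cols, drop = FALSE]

  # One gene ID per line, no header, no quotes.
  out_file <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  write.table(out$query, out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  out
}

# Possible usage, assuming gene_list and the output directory from above:
# for (k in seq_len(nrow(gene_list))) {
#   out <- annotate_cluster(gene_list, k, "../utilities/GTEX2013_sparse_fac_voom")
#   print(knitr::kable(out))
# }
```

Passing `query_fun` explicitly makes it possible to try the helper offline with a stub that returns a data frame containing a `query` column.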
out <- mygene::queryMany(gene_list[10,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| X_id | summary | name | symbol | query |
|---|---|---|---|---|
| 56265 | This gene likely encodes a member of the carboxypeptidase family of proteins. Cloning of a comparable locus in mouse indicates that the encoded protein contains a discoidin domain and a carboxypeptidase domain, but the protein appears to lack residues necessary for carboxypeptidase activity. | carboxypeptidase X (M14 family), member 1 | CPXM1 | ENSG00000088882 |
| ENSG00000263065 | NA | NA | AF001548.6 | ENSG00000263065 |
| ENSG00000263335 | NA | NA | AF001548.5 | ENSG00000263335 |
| 81610 | NA | family with sequence similarity 83 member D | FAM83D | ENSG00000101447 |
| 1674 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | desmin | DES | ENSG00000175084 |
| 9472 | The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. The encoded protein is highly expressed in various brain regions and cardiac and skeletal muscle. It is specifically localized to the sarcoplasmic reticulum and nuclear membrane, and is involved in anchoring PKA to the nuclear membrane or sarcoplasmic reticulum. | A-kinase anchoring protein 6 | AKAP6 | ENSG00000151320 |
| ENSG00000234638 | NA | NA | AC053503.6 | ENSG00000234638 |
| 1264 | NA | calponin 1 | CNN1 | ENSG00000130176 |
| 4629 | The protein encoded by this gene is a smooth muscle myosin belonging to the myosin heavy chain family. The gene product is a subunit of a hexameric protein that consists of two heavy chain subunits and two pairs of non-identical light chain subunits. It functions as a major contractile protein, converting chemical energy into mechanical energy through the hydrolysis of ATP. The gene encoding a human ortholog of rat NUDE1 is transcribed from the reverse strand of this gene, and its 3’ end overlaps with that of the latter. The pericentric inversion of chromosome 16 [inv(16)(p13q22)] produces a chimeric transcript that encodes a protein consisting of the first 165 residues from the N terminus of core-binding factor beta in a fusion with the C-terminal portion of the smooth muscle myosin heavy chain. This chromosomal rearrangement is associated with acute myeloid leukemia of the M4Eo subtype. Alternative splicing generates isoforms that are differentially expressed, with ratios changing during muscle cell maturation. Alternatively spliced transcript variants encoding different isoforms have been identified. | myosin, heavy chain 11, smooth muscle | MYH11 | ENSG00000133392 |
| 51676 | This gene encodes a member of the ankyrin repeat and SOCS box-containing (ASB) protein family. These proteins play a role in protein degradation by coupling suppressor of cytokine signalling (SOCS) proteins with the elongin BC complex. The encoded protein is a subunit of a multimeric E3 ubiquitin ligase complex that mediates the degradation of actin-binding proteins. This gene plays a role in retinoic acid-induced growth inhibition and differentiation of myeloid leukemia cells. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | ankyrin repeat and SOCS box containing 2 | ASB2 | ENSG00000100628 |
| 9423 | Netrin is included in a family of laminin-related secreted proteins. The function of this gene has not yet been defined; however, netrin is thought to be involved in axon guidance and cell migration during development. Mutations and loss of expression of netrin suggest that variation in netrin may be involved in cancer development. | netrin 1 | NTN1 | ENSG00000065320 |
| 1000 | This gene encodes a classical cadherin and member of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate a calcium-dependent cell adhesion molecule and glycoprotein. This protein plays a role in the establishment of left-right asymmetry, development of the nervous system and the formation of cartilage and bone. | cadherin 2 | CDH2 | ENSG00000170558 |
| 4232 | This gene encodes a member of the alpha/beta hydrolase superfamily. It is imprinted, exhibiting preferential expression from the paternal allele in fetal tissues, and isoform-specific imprinting in lymphocytes. The loss of imprinting of this gene has been linked to certain types of cancer and may be due to promoter switching. The encoded protein may play a role in development. Alternatively spliced transcript variants encoding multiple isoforms have been identified for this gene. Pseudogenes of this gene are located on the short arm of chromosomes 3 and 4, and the long arm of chromosomes 6 and 15. | mesoderm specific transcript | MEST | ENSG00000106484 |
| 104326055 | NA | APOA1 antisense RNA | APOA1-AS | ENSG00000235910 |
| 22943 | This gene encodes a protein that is a member of the dickkopf family. It is a secreted protein with two cysteine rich regions and is involved in embryonic development through its inhibition of the WNT signaling pathway. Elevated levels of DKK1 in bone marrow plasma and peripheral blood are associated with the presence of osteolytic bone lesions in patients with multiple myeloma. | dickkopf WNT signaling pathway inhibitor 1 | DKK1 | ENSG00000107984 |
| 335 | This gene encodes apolipoprotein A-I, which is the major protein component of high density lipoprotein (HDL) in plasma. The encoded preproprotein is proteolytically processed to generate the mature protein, which promotes cholesterol efflux from tissues to the liver for excretion, and is a cofactor for lecithin cholesterolacyltransferase (LCAT), an enzyme responsible for the formation of most plasma cholesteryl esters. This gene is closely linked with two other apolipoprotein genes on chromosome 11. Defects in this gene are associated with HDL deficiencies, including Tangier disease, and with systemic non-neuropathic amyloidosis. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein. | apolipoprotein A1 | APOA1 | ENSG00000118137 |
| 57124 | NA | CD248 molecule | CD248 | ENSG00000174807 |
| 100885848 | NA | prostaglandin E synthase 3 (cytosolic)-like | PTGES3L | ENSG00000267060 |
| 147906 | NA | dishevelled binding antagonist of beta catenin 3 | DACT3 | ENSG00000197380 |
| ENSG00000261054 | NA | NA | RP11-6O2.4 | ENSG00000261054 |
| 4897 | Cell adhesion molecules (CAMs) are members of the immunoglobulin superfamily. This gene encodes a neuronal cell adhesion molecule with multiple immunoglobulin-like C2-type domains and fibronectin type-III domains. This ankyrin-binding protein is involved in neuron-neuron adhesion and promotes directional signaling during axonal cone growth. This gene is also expressed in non-neural tissues and may play a general role in cell-cell communication via signaling from its intracellular domain to the actin cytoskeleton during directional cell migration. Allelic variants of this gene have been associated with autism and addiction vulnerability. Alternative splicing results in multiple transcript variants encoding different isoforms. | neuronal cell adhesion molecule | NRCAM | ENSG00000091129 |
| 5346 | The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | perilipin 1 | PLIN1 | ENSG00000166819 |
| 117178 | This gene encodes a protein that binds the cancer-testis antigen Synovial Sarcoma X breakpoint 2 protein. The encoded protein may regulate the activity of Synovial Sarcoma X breakpoint 2 protein in malignant cells. Alternate splicing results in multiple transcript variants. A pseudogene of this gene is found on chromosome 3. | SSX family member 2 interacting protein | SSX2IP | ENSG00000117155 |
| 6591 | This gene encodes a member of the Snail family of C2H2-type zinc finger transcription factors. The encoded protein acts as a transcriptional repressor that binds to E-box motifs and is also likely to repress E-cadherin transcription in breast carcinoma. This protein is involved in epithelial-mesenchymal transitions and has antiapoptotic activity. Mutations in this gene may be associated with sporadic cases of neural tube defects. | snail family transcriptional repressor 2 | SNAI2 | ENSG00000019549 |
| 375061 | NA | family with sequence similarity 89 member A | FAM89A | ENSG00000182118 |
| 8736 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD (myomesin 1) and 165 kD (myomesin 2). This protein, myomesin 1, like myomesin 2, titin, and other myofibrillar proteins contains structural modules with strong homology to either fibronectin type III (motif I) or immunoglobulin C2 (motif II) domains. Myomesin 1 and myomesin 2 each have a unique N-terminal region followed by 12 modules of motif I or motif II, in the arrangement II-II-I-I-I-I-I-II-II-II-II-II. The two proteins share 50% sequence identity in this repeat-containing region. The head structure formed by these 2 proteins on one end of the titin string extends into the center of the M band. The integrating structure of the sarcomere arises from muscle-specific members of the superfamily of immunoglobulin-like proteins. Alternatively spliced transcript variants encoding different isoforms have been identified. | myomesin 1 | MYOM1 | ENSG00000101605 |
| 5881 | The protein encoded by this gene is a GTPase which belongs to the RAS superfamily of small GTP-binding proteins. Members of this superfamily appear to regulate a diverse array of cellular events, including the control of cell growth, cytoskeletal reorganization, and the activation of protein kinases. Alternative splicing results in multiple transcript variants. | ras-related C3 botulinum toxin substrate 3 (rho family, small GTP binding protein Rac3) | RAC3 | ENSG00000169750 |
| ENSG00000231346 | NA | long intergenic non-protein coding RNA 1160 | LINC01160 | ENSG00000231346 |
| 100506826 | NA | MYLK antisense RNA 1 | MYLK-AS1 | ENSG00000239523 |
| ENSG00000254756 | NA | NA | RP11-867G23.12 | ENSG00000254756 |
| 348093 | NA | RNA binding protein with multiple splicing 2 | RBPMS2 | ENSG00000166831 |
| 72 | Actins are highly conserved proteins that are involved in various types of cell motility and in the maintenance of the cytoskeleton. Three types of actins, alpha, beta and gamma, have been identified in vertebrates. Alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. This gene encodes actin gamma 2; a smooth muscle actin found in enteric tissues. Alternative splicing results in multiple transcript variants encoding distinct isoforms. Based on similarity to peptide cleavage of related actins, the mature protein of this gene is formed by removal of two N-terminal peptides. | actin, gamma 2, smooth muscle, enteric | ACTG2 | ENSG00000163017 |
| 4828 | This gene encodes a member of the bombesin-like family of neuropeptides, which negatively regulate eating behavior. The encoded protein may regulate colonic smooth muscle contraction through binding to its cognate receptor, the neuromedin B receptor (NMBR). Polymorphisms of this gene may be associated with hunger, weight gain and obesity. Alternative splicing results in multiple transcript variants. | neuromedin B | NMB | ENSG00000197696 |
| 50486 | NA | G0/G1 switch 2 | G0S2 | ENSG00000123689 |
| 26136 | Cancer-associated chromosomal changes often involve regions containing fragile sites. This gene maps to a common fragile site on chromosome 7q31.2 designated FRA7G. This gene is similar to mouse Testin, a testosterone-responsive gene encoding a Sertoli cell secretory protein containing three LIM domains. LIM domains are double zinc-finger motifs that mediate protein-protein interactions between transcription factors, cytoskeletal proteins and signaling proteins. This protein is a negative regulator of cell growth and may act as a tumor suppressor. This scaffold protein may also play a role in cell adhesion, cell spreading and in the reorganization of the actin cytoskeleton. Multiple protein isoforms are encoded by transcript variants of this gene. | testin LIM domain protein | TES | ENSG00000135269 |
| 2167 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs' roles include fatty acid uptake, transport, and metabolism. | fatty acid binding protein 4 | FABP4 | ENSG00000170323 |
| 6288 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | serum amyloid A1 | SAA1 | ENSG00000173432 |
| 27296 | NA | TP53 target 5 | TP53TG5 | ENSG00000124251 |
| 1601 | This gene encodes a mitogen-responsive phosphoprotein. It is expressed in normal ovarian epithelial cells, but is down-regulated or absent from ovarian carcinoma cell lines, suggesting its role as a tumor suppressor. This protein binds to the SH3 domains of GRB2, an adaptor protein that couples tyrosine kinase receptors to SOS (a guanine nucleotide exchange factor for Ras), via its C-terminal proline-rich sequences, and may thus modulate growth factor/Ras pathways by competing with SOS for binding to GRB2. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | DAB2, clathrin adaptor protein | DAB2 | ENSG00000153071 |
| 100127983 | NA | chromosome 8 open reading frame 88 | C8orf88 | ENSG00000253250 |
| 9610 | NA | Ras and Rab interactor 1 | RIN1 | ENSG00000174791 |
| 58476 | NA | tumor protein p53 inducible nuclear protein 2 | TP53INP2 | ENSG00000078804 |
| 4638 | This gene, a muscle member of the immunoglobulin gene superfamily, encodes myosin light chain kinase which is a calcium/calmodulin dependent enzyme. This kinase phosphorylates myosin regulatory light chains to facilitate myosin interaction with actin filaments to produce contractile activity. This gene encodes both smooth muscle and nonmuscle isoforms. In addition, using a separate promoter in an intron in the 3’ region, it encodes telokin, a small protein identical in sequence to the C-terminus of myosin light chain kinase, that is independently expressed in smooth muscle and functions to stabilize unphosphorylated myosin filaments. A pseudogene is located on the p arm of chromosome 3. Four transcript variants that produce four isoforms of the calcium/calmodulin dependent enzyme have been identified as well as two transcripts that produce two isoforms of telokin. Additional variants have been identified but lack full length transcripts. | myosin light chain kinase | MYLK | ENSG00000065534 |
| 101929595 | NA | uncharacterized LOC101929595 | LOC101929595 | ENSG00000245293 |
| 1759 | This gene encodes a member of the dynamin subfamily of GTP-binding proteins. The encoded protein possesses unique mechanochemical properties used to tubulate and sever membranes, and is involved in clathrin-mediated endocytosis and other vesicular trafficking processes. Actin and other cytoskeletal proteins act as binding partners for the encoded protein, which can also self-assemble leading to stimulation of GTPase activity. More than sixty highly conserved copies of the 3’ region of this gene are found elsewhere in the genome, particularly on chromosomes Y and 15. Alternatively spliced transcript variants encoding different isoforms have been described. | dynamin 1 | DNM1 | ENSG00000106976 |
| 6324 | Voltage-gated sodium channels are heteromeric proteins that function in the generation and propagation of action potentials in muscle and neuronal cells. They are composed of one alpha and two beta subunits, where the alpha subunit provides channel activity and the beta-1 subunit modulates the kinetics of channel inactivation. This gene encodes a sodium channel beta-1 subunit. Mutations in this gene result in generalized epilepsy with febrile seizures plus, Brugada syndrome 5, and defects in cardiac conduction. Multiple transcript variants encoding different isoforms have been found for this gene. | sodium voltage-gated channel beta subunit 1 | SCN1B | ENSG00000105711 |
| 10544 | The protein encoded by this gene is a receptor for activated protein C, a serine protease activated by and involved in the blood coagulation pathway. The encoded protein is an N-glycosylated type I membrane protein that enhances the activation of protein C. Mutations in this gene have been associated with venous thromboembolism and myocardial infarction, as well as with late fetal loss during pregnancy. The encoded protein may also play a role in malarial infection and has been associated with cancer. | protein C receptor | PROCR | ENSG00000101000 |
| 7871 | This gene encodes a component of a conserved striatin-interacting phosphatase and kinase complex. Striatin family complexes participate in a variety of cellular processes including signaling, cell cycle control, cell migration, Golgi assembly, and apoptosis. The protein encoded by this gene is a coiled-coil, tail-anchored membrane protein with a single C-terminal transmembrane domain that is posttranslationally inserted into membranes. Mutations in this gene are associated with Brugada syndrome, a cardiac channelopathy. Alternative splicing results in multiple transcript variants. | sarcolemma associated protein | SLMAP | ENSG00000163681 |
| 23043 | Germinal center kinases (GCKs), such as TNIK, are characterized by an N-terminal kinase domain and a C-terminal GCK domain that serves a regulatory function (Fu et al., 1999 [PubMed 10521462]). | TRAF2 and NCK interacting kinase | TNIK | ENSG00000154310 |
| 101930114 | NA | uncharacterized LOC101930114 | LOC101930114 | ENSG00000227591 |
| 6623 | This gene encodes a member of the synuclein family of proteins which are believed to be involved in the pathogenesis of neurodegenerative diseases. Mutations in this gene have also been associated with breast tumor development. | synuclein gamma | SNCG | ENSG00000173267 |
| 79870 | This gene was identified by gene expression studies in patients with acute myeloid leukemia (AML). The gene is conserved among mammals and is not found in lower organisms. Tissues that express this gene develop from the neuroectoderm. Multiple alternatively spliced transcript variants that encode different proteins have been described for this gene; however, some of the transcript variants are found only in AML cell lines. | brain and acute leukemia, cytoplasmic | BAALC | ENSG00000164929 |
| 283807 | This gene encodes a member of the F-box protein family. This F-box protein interacts with S-phase kinase-associated protein 1A and cullin in order to form SCF complexes which function as ubiquitin ligases. | F-box and leucine rich repeat protein 22 | FBXL22 | ENSG00000197361 |
| 2901 | This gene encodes a protein that belongs to the glutamate-gated ionic channel family. Glutamate functions as the major excitatory neurotransmitter in the central nervous system through activation of ligand-gated ion channels and G protein-coupled membrane receptors. The protein encoded by this gene forms functional heteromeric kainate-preferring ionic channels with the subunits encoded by related gene family members. Alternative splicing results in multiple transcript variants. | glutamate ionotropic receptor kainate type subunit 5 | GRIK5 | ENSG00000105737 |
| 23336 | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | synemin | SYNM | ENSG00000182253 |
| 7079 | This gene belongs to the TIMP gene family. The proteins encoded by this gene family are inhibitors of the matrix metalloproteinases, a group of peptidases involved in degradation of the extracellular matrix. The secreted, netrin domain-containing protein encoded by this gene is involved in regulation of platelet aggregation and recruitment and may play a role in hormonal regulation and endometrial tissue remodeling. | TIMP metallopeptidase inhibitor 4 | TIMP4 | ENSG00000157150 |
| 646023 | NA | ADORA2A antisense RNA 1 | ADORA2A-AS1 | ENSG00000178803 |
| 100874032 | NA | PRRT3 antisense RNA 1 | PRRT3-AS1 | ENSG00000230082 |
| 1513 | The protein encoded by this gene is a lysosomal cysteine proteinase involved in bone remodeling and resorption. This protein, which is a member of the peptidase C1 protein family, is predominantly expressed in osteoclasts. However, the encoded protein is also expressed in a significant fraction of human breast cancers, where it could contribute to tumor invasiveness. Mutations in this gene are the cause of pycnodysostosis, an autosomal recessive disease characterized by osteosclerosis and short stature. | cathepsin K | CTSK | ENSG00000143387 |
| 51559 | NA | 5’-nucleotidase domain containing 3 | NT5DC3 | ENSG00000111696 |
| 6450 | NA | SH3 domain binding glutamate rich protein | SH3BGR | ENSG00000185437 |
| 4487 | This gene encodes a member of the muscle segment homeobox gene family. The encoded protein functions as a transcriptional repressor during embryogenesis through interactions with components of the core transcription complex and other homeoproteins. It may also have roles in limb-pattern formation, craniofacial development, particularly odontogenesis, and tumor growth inhibition. Mutations in this gene, which was once known as homeobox 7, have been associated with nonsyndromic cleft lip with or without cleft palate 5, Witkop syndrome, Wolf-Hirschhorn syndrome, and autosomal dominant hypodontia. | msh homeobox 1 | MSX1 | ENSG00000163132 |
| 10669 | NA | cell growth regulator with EF-hand domain 1 | CGREF1 | ENSG00000138028 |
| 5157 | This gene encodes a protein with significant sequence similarity to the ligand binding domain of platelet-derived growth factor receptor beta. Mutations in this gene, or deletion of a chromosomal segment containing this gene, are associated with sporadic hepatocellular carcinomas, colorectal cancers, and non-small cell lung cancers. This suggests this gene product may function as a tumor suppressor. | platelet derived growth factor receptor like | PDGFRL | ENSG00000104213 |
| 81544 | Glycerophosphodiester phosphodiesterases (GDPDs; EC 3.1.4.46), such as GDPD5, are involved in glycerol metabolism (Lang et al., 2008 [PubMed 17578682]). | glycerophosphodiester phosphodiesterase domain containing 5 | GDPD5 | ENSG00000158555 |
| 55897 | NA | mesoderm posterior bHLH transcription factor 1 | MESP1 | ENSG00000166823 |
| ENSG00000230289 | NA | NA | RP11-334J6.6 | ENSG00000230289 |
| 63924 | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | cell death inducing DFFA like effector c | CIDEC | ENSG00000187288 |
| 7301 | The gene is part of a 3-member transmembrane receptor kinase receptor family with a processed pseudogene distal on chromosome 15. The encoded protein is activated by the products of the growth arrest-specific gene 6 and protein S genes and is involved in controlling cell survival and proliferation, spermatogenesis, immunoregulation and phagocytosis. The encoded protein has also been identified as a cell entry factor for Ebola and Marburg viruses. | TYRO3 protein tyrosine kinase | TYRO3 | ENSG00000092445 |
| 2791 | This gene is a member of the guanine nucleotide-binding protein (G protein) gamma family and encodes a lipid-anchored, cell membrane protein. As a member of the heterotrimeric G protein complex, this protein plays a role in this transmembrane signaling system. This protein is also subject to carboxyl-terminal processing. Decreased expression of this gene is associated with splenic marginal zone lymphomas. | G protein subunit gamma 11 | GNG11 | ENSG00000127920 |
| 257177 | NA | cilia and flagella associated protein 126 | CFAP126 | ENSG00000188931 |
| 7106 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. This encoded protein is a cell surface glycoprotein and is similar in sequence to its family member CD53 antigen. It is known to complex with integrins and other transmembrane 4 superfamily proteins. Alternatively spliced transcript variants encoding different isoforms have been identified. | tetraspanin 4 | TSPAN4 | ENSG00000214063 |
| 3991 | The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | lipase E, hormone sensitive type | LIPE | ENSG00000079435 |
| 29109 | This gene encodes a protein which is a member of the formin/diaphanous family of proteins. The gene is ubiquitously expressed but is found in abundance in the spleen. The encoded protein has sequence homology to diaphanous and formin proteins within the Formin Homology (FH)1 and FH2 domains. It also contains a coiled-coil domain, a collagen-like domain, two nuclear localization signals, and several potential PKC and PKA phosphorylation sites. It is a predominantly cytoplasmic protein and is expressed in a variety of human cell lines. Alternative splicing results in multiple transcript variants. | formin homology 2 domain containing 1 | FHOD1 | ENSG00000135723 |
| 115908 | This locus encodes a protein that may play a role in the cellular response to arterial injury through involvement in vascular remodeling. Mutations at this locus have been associated with Barrett esophagus and esophageal adenocarcinoma. Alternatively spliced transcript variants have been described. | collagen triple helix repeat containing 1 | CTHRC1 | ENSG00000164932 |
| 2194 | The enzyme encoded by this gene is a multifunctional protein. Its main function is to catalyze the synthesis of palmitate from acetyl-CoA and malonyl-CoA, in the presence of NADPH, into long-chain saturated fatty acids. In some cancer cell lines, this protein has been found to be fused with estrogen receptor-alpha (ER-alpha), in which the N-terminus of FAS is fused in-frame with the C-terminus of ER-alpha. | fatty acid synthase | FASN | ENSG00000169710 |
| 57449 | This gene encodes a protein that activates the nuclear factor kappa B (NFKB1) signaling pathway. Mutations in this gene are associated with autosomal recessive distal spinal muscular atrophy. Multiple transcript variants encoding different isoforms have been found for this gene. | pleckstrin homology and RhoGEF domain containing G5 | PLEKHG5 | ENSG00000171680 |
| 4660 | Myosin phosphatase is a protein complex comprised of three subunits: a catalytic subunit (PP1c-delta, protein phosphatase 1, catalytic subunit delta), a large regulatory subunit (MYPT, myosin phosphatase target) and a small regulatory subunit (sm-M20). Two isoforms of MYPT have been isolated: MYPT1, which is widely expressed, and MYPT2, which may be specific to heart, skeletal muscle, and brain. Each of the MYPT isoforms functions to bind PP1c-delta and increase phosphatase activity. This locus encodes both MYPT2 and M20. Alternatively spliced transcript variants encoding different isoforms have been identified. Related pseudogenes have been defined on the Y chromosome. | protein phosphatase 1 regulatory subunit 12B | PPP1R12B | ENSG00000077157 |
| 5118 | Fibrillar collagen types I-III are synthesized as precursor molecules known as procollagens. These precursors contain amino- and carboxyl-terminal peptide extensions known as N- and C-propeptides, respectively, which are cleaved, upon secretion of procollagen from the cell, to yield the mature triple helical, highly structured fibrils. This gene encodes a glycoprotein which binds and drives the enzymatic cleavage of type I procollagen and heightens C-proteinase activity. | procollagen C-endopeptidase enhancer | PCOLCE | ENSG00000106333 |
| 3306 | NA | heat shock protein family A (Hsp70) member 2 | HSPA2 | ENSG00000126803 |
| 5350 | The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase in cardiac muscle. The encoded protein is an inhibitor of cardiac muscle sarcoplasmic reticulum Ca(2+)-ATPase in the unphosphorylated state, but inhibition is relieved upon phosphorylation of the protein. The subsequent activation of the Ca(2+) pump leads to enhanced muscle relaxation rates, thereby contributing to the inotropic response elicited in heart by beta-agonists. The encoded protein is a key regulator of cardiac diastolic function. Mutations in this gene are a cause of inherited human dilated cardiomyopathy with refractory congestive heart failure, and also familial hypertrophic cardiomyopathy. | phospholamban | PLN | ENSG00000198523 |
| 9454 | This gene encodes a member of the HOMER family of postsynaptic density scaffolding proteins that share a similar domain structure consisting of an N-terminal Enabled/vasodilator-stimulated phosphoprotein homology 1 domain which mediates protein-protein interactions, and a carboxy-terminal coiled-coil domain and two leucine zipper motifs that are involved in self-oligomerization. The encoded protein binds numerous other proteins including group I metabotropic glutamate receptors, inositol 1,4,5-trisphosphate receptors and amyloid precursor proteins and has been implicated in diverse biological functions such as neuronal signaling, T-cell activation and trafficking of amyloid beta peptides. Alternative splicing results in multiple transcript variants. | homer scaffolding protein 3 | HOMER3 | ENSG00000051128 |
| 2318 | This gene encodes one of three related filamin genes, specifically gamma filamin. These filamin proteins crosslink actin filaments into orthogonal networks in cortical cytoplasm and participate in the anchoring of membrane proteins for the actin cytoskeleton. Three functional domains exist in filamin: an N-terminal filamentous actin-binding domain, a C-terminal self-association domain, and a membrane glycoprotein-binding domain. Two transcript variants encoding different isoforms have been found for this gene. | filamin C | FLNC | ENSG00000128591 |
| ENSG00000229894 | NA | NA | RP11-668G10.2 | ENSG00000229894 |
| 245711 | NA | speedy/RINGO cell cycle regulator family member A | SPDYA | ENSG00000163806 |
| 10267 | The protein encoded by this gene is a member of the RAMP family of single-transmembrane-domain proteins, called receptor (calcitonin) activity modifying proteins (RAMPs). RAMPs are type I transmembrane proteins with an extracellular N terminus and a cytoplasmic C terminus. RAMPs are required to transport calcitonin-receptor-like receptor (CRLR) to the plasma membrane. CRLR, a receptor with seven transmembrane domains, can function as either a calcitonin-gene-related peptide (CGRP) receptor or an adrenomedullin receptor, depending on which members of the RAMP family are expressed. In the presence of this (RAMP1) protein, CRLR functions as a CGRP receptor. The RAMP1 protein is involved in the terminal glycosylation, maturation, and presentation of the CGRP receptor to the cell surface. Alternative splicing results in multiple transcript variants encoding different isoforms. | receptor activity modifying protein 1 | RAMP1 | ENSG00000132329 |
| 151887 | NA | coiled-coil domain containing 80 | CCDC80 | ENSG00000091986 |
| 9501 | The protein encoded by this gene plays a direct regulatory role in calcium-ion-dependent exocytosis in both endocrine and exocrine cells and plays a key role in insulin secretion by pancreatic cells. This gene is likely a tumor suppressor. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | rabphilin 3A-like (without C2 domains) | RPH3AL | ENSG00000181031 |
| 1846 | The protein encoded by this gene is a member of the dual specificity protein phosphatase subfamily. These phosphatases inactivate their target kinases by dephosphorylating both the phosphoserine/threonine and phosphotyrosine residues. They negatively regulate members of the mitogen-activated protein (MAP) kinase superfamily (MAPK/ERK, SAPK/JNK, p38), which are associated with cellular proliferation and differentiation. Different members of the family of dual specificity phosphatases show distinct substrate specificities for various MAP kinases, different tissue distribution and subcellular localization, and different modes of inducibility of their expression by extracellular stimuli. This gene product inactivates ERK1, ERK2 and JNK, is expressed in a variety of tissues, and is localized in the nucleus. Two alternatively spliced transcript variants, encoding distinct isoforms, have been observed for this gene. In addition, multiple polyadenylation sites have been reported. | dual specificity phosphatase 4 | DUSP4 | ENSG00000120875 |
| 4192 | This gene encodes a member of a small family of secreted growth factors that binds heparin and responds to retinoic acid. The encoded protein promotes cell growth, migration, and angiogenesis, in particular during tumorigenesis. This gene has been targeted as a therapeutic for a variety of different disorders. Alternatively spliced transcript variants encoding multiple isoforms have been observed. | midkine (neurite growth-promoting factor 2) | MDK | ENSG00000110492 |
| 205 | This gene encodes a member of the adenylate kinase family of enzymes. The encoded protein is localized to the mitochondrial matrix. Adenylate kinases regulate the adenine and guanine nucleotide compositions within a cell by catalyzing the reversible transfer of phosphate group among these nucleotides. Five isozymes of adenylate kinase have been identified in vertebrates. Expression of these isozymes is tissue-specific and developmentally regulated. A pseudogene for this gene has been located on chromosome 17. Three transcript variants encoding the same protein have been identified for this gene. Sequence alignment suggests that the gene defined by NM_013410, NM_203464, and NM_001005353 is located on chromosome 1. | adenylate kinase 4 | AK4 | ENSG00000162433 |
| 284358 | NA | MEF2 activating motif and SAP domain containing transcriptional regulator | MAMSTR | ENSG00000176909 |
| 5468 | This gene encodes a member of the peroxisome proliferator-activated receptor (PPAR) subfamily of nuclear receptors. PPARs form heterodimers with retinoid X receptors (RXRs) and these heterodimers regulate transcription of various genes. Three subtypes of PPARs are known: PPAR-alpha, PPAR-delta, and PPAR-gamma. The protein encoded by this gene is PPAR-gamma and is a regulator of adipocyte differentiation. Additionally, PPAR-gamma has been implicated in the pathology of numerous diseases including obesity, diabetes, atherosclerosis and cancer. Alternatively spliced transcript variants that encode different isoforms have been described. | peroxisome proliferator activated receptor gamma | PPARG | ENSG00000132170 |
| 8165 | The A-kinase anchor proteins (AKAPs) are a group of structurally diverse proteins, which have the common function of binding to the regulatory subunit of protein kinase A (PKA) and confining the holoenzyme to discrete locations within the cell. This gene encodes a member of the AKAP family. The encoded protein binds to type I and type II regulatory subunits of PKA and anchors them to the mitochondrion. This protein is speculated to be involved in the cAMP-dependent signal transduction pathway and in directing RNA to a specific cellular compartment. | A-kinase anchoring protein 1 | AKAP1 | ENSG00000121057 |
| ENSG00000231050 | NA | NA | RP1-140A9.1 | ENSG00000231050 |
| 11149 | This gene encodes a member of the POP family of proteins containing three putative transmembrane domains. This gene is expressed in cardiac and skeletal muscle and may play an important role in development of these tissues. The mouse ortholog may be involved in the regeneration of adult skeletal muscle and may act as a cell adhesion molecule in coronary vasculogenesis. Three transcript variants encoding the same protein have been found for this gene. | blood vessel epicardial substance | BVES | ENSG00000112276 |
| 5802 | The protein encoded by this gene is a member of the protein tyrosine phosphatase (PTP) family. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. This PTP contains an extracellular region, a single transmembrane segment and two tandem intracytoplasmic catalytic domains, and thus represents a receptor-type PTP. The extracellular region of this protein is composed of multiple Ig-like and fibronectin type III-like domains. Studies of the similar gene in mice suggested that this PTP may be involved in cell-cell interaction, primary axonogenesis, and axon guidance during embryogenesis. This PTP has been also implicated in the molecular control of adult nerve repair. Four alternatively spliced transcript variants, which encode distinct proteins, have been reported. | protein tyrosine phosphatase, receptor type S | PTPRS | ENSG00000105426 |
| 29108 | This gene encodes an adaptor protein that is composed of two protein-protein interaction domains: an N-terminal PYRIN-PAAD-DAPIN domain (PYD) and a C-terminal caspase-recruitment domain (CARD). The PYD and CARD domains are members of the six-helix bundle death domain-fold superfamily that mediates assembly of large signaling complexes in the inflammatory and apoptotic signaling pathways via the activation of caspase. In normal cells, this protein is localized to the cytoplasm; however, in cells undergoing apoptosis, it forms ball-like aggregates near the nuclear periphery. Two transcript variants encoding different isoforms have been found for this gene. | PYD and CARD domain containing | PYCARD | ENSG00000103490 |
| 125058 | NA | TBC1 domain family member 16 | TBC1D16 | ENSG00000167291 |
| ENSG00000243829 | NA | NA | CTB-33G10.1 | ENSG00000243829 |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",10,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
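
The annotation tables returned by the query do not always list their columns in the same order, and some carry an extra `notfound` column. A minimal sketch of one way to give every cluster's table the same layout before passing it to `kable` is shown here. The helper `standardize_annotation` and the toy data frame are hypothetical and are not part of the original analysis; the column names are the ones that appear in the tables of this document.

```r
# Hypothetical helper (an assumption, not part of the original analysis):
# put the annotation columns in a fixed order so every table looks the same.
standardize_annotation <- function(df,
                                   cols = c("query", "symbol", "name",
                                            "X_id", "summary")) {
  df <- as.data.frame(df)
  # Add any expected column that the query did not return, filled with NA.
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- rep(NA, nrow(df))
  }
  # Keep only the expected columns, in the requested order.
  df[, cols, drop = FALSE]
}

# Toy input that mimics the column order of one query result.
toy <- data.frame(
  name = "serum amyloid A2",
  query = "ENSG00000134339",
  symbol = "SAA2",
  X_id = "6289",
  summary = NA,
  notfound = NA,
  stringsAsFactors = FALSE
)
standardize_annotation(toy)
```

With this in place, a call such as `kable(standardize_annotation(out))` would be expected to print the columns in the same order for every cluster, although this has not been run against the actual query results.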
out <- mygene::queryMany(gene_list[11,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | query | symbol | X_id | summary | notfound |
|---|---|---|---|---|---|
| serum amyloid A2 | ENSG00000134339 | SAA2 | 6289 | NA | NA |
| SAA2-SAA4 readthrough | ENSG00000255071 | SAA2-SAA4 | 100528017 | This locus represents naturally occurring read-through transcription between the neighboring serum amyloid A2 and serum amyloid A4 genes on chromosome 11. The read-through transcript produces a fusion protein that shares sequence identity with each individual gene product. | NA |
| ankyrin 1 | ENSG00000029534 | ANK1 | 286 | Ankyrins are a family of proteins that link the integral membrane proteins to the underlying spectrin-actin cytoskeleton and play key roles in activities such as cell motility, activation, proliferation, contact and the maintenance of specialized membrane domains. Multiple isoforms of ankyrin with different affinities for various target proteins are expressed in a tissue-specific, developmentally regulated manner. Most ankyrins are typically composed of three structural domains: an amino-terminal domain containing multiple ankyrin repeats; a central region with a highly conserved spectrin binding domain; and a carboxy-terminal regulatory domain which is the least conserved and subject to variation. Ankyrin 1, the prototype of this family, was first discovered in the erythrocytes, but since has also been found in brain and muscles. Mutations in erythrocytic ankyrin 1 have been associated in approximately half of all patients with hereditary spherocytosis. Complex patterns of alternative splicing in the regulatory domain, giving rise to different isoforms of ankyrin 1 have been described. Truncated muscle-specific isoforms of ankyrin 1 resulting from usage of an alternate promoter have also been identified. | NA |
| trophoblast glycoprotein | ENSG00000146242 | TPBG | 7162 | This gene encodes a leucine-rich transmembrane glycoprotein that may be involved in cell adhesion. The encoded protein is an oncofetal antigen that is specific to trophoblast cells. In adults this protein is highly expressed in many tumor cells and is associated with poor clinical outcome in numerous cancers. Alternate splicing in the 5’ UTR results in multiple transcript variants that encode the same protein. | NA |
| glutathione S-transferase alpha 1 | ENSG00000243955 | GSTA1 | 2938 | This gene encodes a member of a family of enzymes that function to add glutathione to target electrophilic compounds, including carcinogens, therapeutic drugs, environmental toxins, and products of oxidative stress. This action is an important step in detoxification of these compounds. This subfamily of enzymes has a particular role in protecting cells from reactive oxygen species and the products of peroxidation. Polymorphisms in this gene influence the ability of individuals to metabolize different drugs. This gene is located in a cluster of similar genes and pseudogenes on chromosome 6. Alternative splicing results in multiple transcript variants. | NA |
| serum amyloid A1 | ENSG00000173432 | SAA1 | 6288 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | NA |
| tubulin beta 6 class V | ENSG00000176014 | TUBB6 | 84617 | NA | NA |
| NOTCH1 associated lncRNA in T-cell acute lymphoblastic leukemia 1 | ENSG00000237886 | NALT1 | ENSG00000237886 | NA | NA |
| vitronectin | ENSG00000109072 | VTN | 7448 | The protein encoded by this gene is a member of the pexin family. It is found in serum and tissues and promotes cell adhesion and spreading, inhibits the membrane-damaging effect of the terminal cytolytic complement pathway, and binds to several serpin serine protease inhibitors. It is a secreted protein and exists in either a single chain form or a clipped, two chain form held together by a disulfide bond. | NA |
| NA | ENSG00000270670 | RP11-248C1.3 | ENSG00000270670 | NA | NA |
| NA | ENSG00000242198 | CTD-2235C13.1 | ENSG00000242198 | NA | NA |
| NA | ENSG00000251196 | RP11-54F2.1 | ENSG00000251196 | NA | NA |
| neuronal calcium sensor 1 | ENSG00000107130 | NCS1 | 23413 | This gene is a member of the neuronal calcium sensor gene family, which encode calcium-binding proteins expressed predominantly in neurons. The protein encoded by this gene regulates G protein-coupled receptor phosphorylation in a calcium-dependent manner and can substitute for calmodulin. The protein is associated with secretory granules and modulates synaptic transmission and synaptic plasticity. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| NA | ENSG00000233593 | RP4-665J23.1 | ENSG00000233593 | NA | NA |
| cofilin 1 (non-muscle) pseudogene 5 | ENSG00000213830 | CFL1P5 | ENSG00000213830 | NA | NA |
| ribosomal protein L5 pseudogene 23 | ENSG00000240395 | RPL5P23 | ENSG00000240395 | NA | NA |
| epithelial cell adhesion molecule | ENSG00000119888 | EPCAM | 4072 | This gene encodes a carcinoma-associated antigen and is a member of a family that includes at least two type I membrane proteins. This antigen is expressed on most normal epithelial cells and gastrointestinal carcinomas and functions as a homotypic calcium-independent cell adhesion molecule. The antigen is being used as a target for immunotherapy treatment of human carcinomas. Mutations in this gene result in congenital tufting enteropathy. | NA |
| potassium calcium-activated channel subfamily M alpha 1 | ENSG00000156113 | KCNMA1 | 3778 | MaxiK channels are large conductance, voltage and calcium-sensitive potassium channels which are fundamental to the control of smooth muscle tone and neuronal excitability. MaxiK channels can be formed by 2 subunits: the pore-forming alpha subunit, which is the product of this gene, and the modulatory beta subunit. Intracellular calcium regulates the physical association between the alpha and beta subunits. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| NA | ENSG00000253364 | RP11-731F5.2 | ENSG00000253364 | NA | NA |
| erythrocyte membrane protein band 4.1 | ENSG00000159023 | EPB41 | 2035 | The protein encoded by this gene, together with spectrin and actin, constitute the red cell membrane cytoskeletal network. This complex plays a critical role in erythrocyte shape and deformability. Mutations in this gene are associated with type 1 elliptocytosis (EL1). Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | NA |
| NA | ENSG00000180672 | NA | NA | NA | TRUE |
| calsequestrin 2 | ENSG00000118729 | CASQ2 | 845 | The protein encoded by this gene specifies the cardiac muscle family member of the calsequestrin family. Calsequestrin is localized to the sarcoplasmic reticulum in cardiac and slow skeletal muscle cells. The protein is a calcium binding protein that stores calcium for muscle function. Mutations in this gene cause stress-induced polymorphic ventricular tachycardia, also referred to as catecholaminergic polymorphic ventricular tachycardia 2 (CPVT2), a disease characterized by bidirectional ventricular tachycardia that may lead to cardiac arrest. | NA |
| apolipoprotein A2 | ENSG00000158874 | APOA2 | 336 | This gene encodes apolipoprotein (apo-) A-II, which is the second most abundant protein of the high density lipoprotein particles. The protein is found in plasma as a monomer, homodimer, or heterodimer with apolipoprotein D. Defects in this gene may result in apolipoprotein A-II deficiency or hypercholesterolemia. | NA |
| transmembrane protein 54 | ENSG00000121900 | TMEM54 | 113452 | NA | NA |
| phosphatidylinositol glycan anchor biosynthesis class H pseudogene 1 | ENSG00000259657 | PIGHP1 | ENSG00000259657 | NA | NA |
| NA | ENSG00000234638 | AC053503.6 | ENSG00000234638 | NA | NA |
| cell death inducing DFFA like effector c | ENSG00000187288 | CIDEC | 63924 | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | NA |
| NA | ENSG00000255139 | AP000442.1 | ENSG00000255139 | NA | NA |
| caveolin 1 | ENSG00000105974 | CAV1 | 857 | The scaffolding protein encoded by this gene is the main component of the caveolae plasma membranes found in most cell types. The protein links integrin subunits to the tyrosine kinase FYN, an initiating step in coupling integrins to the Ras-ERK pathway and promoting cell cycle progression. The gene is a tumor suppressor gene candidate and a negative regulator of the Ras-p42/44 mitogen-activated kinase cascade. Caveolin 1 and caveolin 2 are located next to each other on chromosome 7 and express colocalizing proteins that form a stable hetero-oligomeric complex. Mutations in this gene have been associated with Berardinelli-Seip congenital lipodystrophy. Alternatively spliced transcripts encode alpha and beta isoforms of caveolin 1. | NA |
| long intergenic non-protein coding RNA 865 | ENSG00000232229 | LINC00865 | 643529 | NA | NA |
| NA | ENSG00000224818 | RP11-134G8.10 | ENSG00000224818 | NA | NA |
| sulfotransferase family 1A member 2 | ENSG00000197165 | SULT1A2 | 6799 | Sulfotransferase enzymes catalyze the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds. These cytosolic enzymes are different in their tissue distributions and substrate specificities. The gene structure (number and length of exons) is similar among family members. This gene encodes one of two phenol sulfotransferases with thermostable enzyme activity. Two alternatively spliced variants that encode the same protein have been described. | NA |
| B-cell translocation gene 1, anti-proliferative | ENSG00000133639 | BTG1 | 694 | This gene is a member of an anti-proliferative gene family that regulates cell growth and differentiation. Expression of this gene is highest in the G0/G1 phases of the cell cycle and downregulated when cells progressed through G1. The encoded protein interacts with several nuclear receptors, and functions as a coactivator of cell differentiation. This locus has been shown to be involved in a t(8;12)(q24;q22) chromosomal translocation in a case of B-cell chronic lymphocytic leukemia. | NA |
| TNF alpha induced protein 8 like 3 | ENSG00000183578 | TNFAIP8L3 | 388121 | NA | NA |
| nocturnin | ENSG00000151014 | NOCT | 25819 | The protein encoded by this gene is highly similar to Nocturnin, a gene identified as a circadian clock regulated gene in Xenopus laevis. This protein and Nocturnin protein share similarity with the C-terminal domain of a yeast transcription factor, carbon catabolite repression 4 (CCR4). The mRNA abundance of a similar gene in mouse has been shown to exhibit circadian rhythmicity, which suggests a role for this protein in clock function or as a circadian clock effector. | NA |
| NA | ENSG00000261337 | NA | NA | NA | TRUE |
| ribokinase | ENSG00000171174 | RBKS | 64080 | This gene encodes a member of the carbohydrate kinase PfkB family. The encoded protein phosphorylates ribose to form ribose-5-phosphate in the presence of ATP and magnesium as a first step in ribose metabolism. Alternative splicing results in multiple transcript variants. | NA |
| NA | ENSG00000236234 | AC091132.1 | ENSG00000236234 | NA | NA |
| NA | ENSG00000236213 | AC006369.2 | ENSG00000236213 | NA | NA |
| interferon induced transmembrane protein 10 | ENSG00000244242 | IFITM10 | 402778 | NA | NA |
| granzyme M | ENSG00000197540 | GZMM | 3004 | Human natural killer (NK) cells and activated lymphocytes express and store a distinct subset of neutral serine proteases together with proteoglycans and other immune effector molecules in large cytoplasmic granules. These serine proteases are collectively termed granzymes and include 4 distinct gene products: granzyme A, granzyme B, granzyme H, and the protein encoded by this gene, granzyme M. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| nucleophosmin 1 (nucleolar phosphoprotein B23, numatrin) pseudogene 37 | ENSG00000219085 | NPM1P37 | ENSG00000219085 | NA | NA |
| desmin | ENSG00000175084 | DES | 1674 | This gene encodes a muscle-specific class III intermediate filament. Homopolymers of this protein form a stable intracytoplasmic filamentous network connecting myofibrils to each other and to the plasma membrane. Mutations in this gene are associated with desmin-related myopathy, a familial cardiac and skeletal myopathy (CSM), and with distal myopathies. | NA |
| fatty acid binding protein 4 | ENSG00000170323 | FABP4 | 2167 | FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs roles include fatty acid uptake, transport, and metabolism. | NA |
| long intergenic non-protein coding RNA 1160 | ENSG00000231346 | LINC01160 | ENSG00000231346 | NA | NA |
| NA | ENSG00000261136 | RP11-37C7.3 | ENSG00000261136 | NA | NA |
| ribosomal protein S20 pseudogene 22 | ENSG00000239218 | RPS20P22 | ENSG00000239218 | NA | NA |
| mesenteric estrogen dependent adipogenesis | ENSG00000102802 | MEDAG | 84935 | NA | NA |
| shisa family member 3 | ENSG00000178343 | SHISA3 | 152573 | NA | NA |
| RNA, 7SL, cytoplasmic 608, pseudogene | ENSG00000239884 | RN7SL608P | ENSG00000239884 | NA | NA |
| protein phosphatase, Mg2+/Mn2+ dependent 1H | ENSG00000111110 | PPM1H | 57460 | NA | NA |
| syntaxin binding protein 6 | ENSG00000168952 | STXBP6 | 29091 | STXBP6 binds components of the SNARE complex (see MIM 603215) and may be involved in regulating SNARE complex formation (Scales et al., 2002 [PubMed 12145319]). | NA |
| NA | ENSG00000229512 | AC068580.5 | ENSG00000229512 | NA | NA |
| insulin induced gene 1 | ENSG00000186480 | INSIG1 | 3638 | Oxysterols regulate cholesterol homeostasis through the liver X receptor (LXR)- and sterol regulatory element-binding protein (SREBP)-mediated signaling pathways. This gene is an insulin-induced gene. It encodes an endoplasmic reticulum (ER) membrane protein that plays a critical role in regulating cholesterol concentrations in cells. This protein binds to the sterol-sensing domains of SREBP cleavage-activating protein (SCAP) and HMG CoA reductase, and is essential for the sterol-mediated trafficking of the two proteins. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | NA |
| activin A receptor like type 1 | ENSG00000139567 | ACVRL1 | 94 | This gene encodes a type I cell-surface receptor for the TGF-beta superfamily of ligands. It shares with other type I receptors a high degree of similarity in serine-threonine kinase subdomains, a glycine- and serine-rich region (called the GS domain) preceding the kinase domain, and a short C-terminal tail. The encoded protein, sometimes termed ALK1, shares similar domain structures with other closely related ALK or activin receptor-like kinase proteins that form a subfamily of receptor serine/threonine kinases. Mutations in this gene are associated with hemorrhagic telangiectasia type 2, also known as Rendu-Osler-Weber syndrome 2. | NA |
| NA | ENSG00000234329 | RP11-767N6.2 | ENSG00000234329 | NA | NA |
| RAB36, member RAS oncogene family | ENSG00000100228 | RAB36 | 9609 | NA | NA |
| activated leukocyte cell adhesion molecule | ENSG00000170017 | ALCAM | 214 | This gene encodes activated leukocyte cell adhesion molecule (ALCAM), also known as CD166 (cluster of differentiation 166), which is a member of a subfamily of immunoglobulin receptors with five immunoglobulin-like domains (VVC2C2C2) in the extracellular domain. This protein binds to T-cell differentiation antigen CD6, and is implicated in the processes of cell adhesion and migration. Multiple alternatively spliced transcript variants encoding different isoforms have been found. | NA |
| repulsive guidance molecule family member a | ENSG00000182175 | RGMA | 56963 | This gene encodes a member of the repulsive guidance molecule family. The encoded protein is a glycosylphosphatidylinositol-anchored glycoprotein that functions as an axon guidance protein in the developing and adult central nervous system. This protein may also function as a tumor suppressor in some cancers. Alternate splicing results in multiple transcript variants. | NA |
| phospholipase A2 group IVB | ENSG00000243708 | PLA2G4B | ENSG00000243708 | NA | NA |
| myosin, heavy chain 7, cardiac muscle, beta | ENSG00000092054 | MYH7 | 4625 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. | NA |
| prolactin | ENSG00000172179 | PRL | 5617 | This gene encodes the anterior pituitary hormone prolactin. This secreted hormone is a growth regulator for many tissues, including cells of the immune system. It may also play a role in cell survival by suppressing apoptosis, and it is essential for lactation. Alternative splicing results in multiple transcript variants that encode the same protein. | NA |
| hydroxysteroid 17-beta dehydrogenase 6 | ENSG00000025423 | HSD17B6 | 8630 | The protein encoded by this gene has both oxidoreductase and epimerase activities and is involved in androgen catabolism. The oxidoreductase activity can convert 3 alpha-adiol to dihydrotestosterone, while the epimerase activity can convert androsterone to epi-androsterone. Both reactions use NAD+ as the preferred cofactor. This gene is a member of the retinol dehydrogenase family. | NA |
| NA | ENSG00000266498 | RP11-45M22.5 | ENSG00000266498 | NA | NA |
| atypical chemokine receptor 3 | ENSG00000144476 | ACKR3 | 57007 | This gene encodes a member of the G-protein coupled receptor family. Although this protein was earlier thought to be a receptor for vasoactive intestinal peptide (VIP), it is now considered to be an orphan receptor, in that its endogenous ligand has not been identified. The protein is also a coreceptor for human immunodeficiency viruses (HIV). Translocations involving this gene and HMGA2 on chromosome 12 have been observed in lipomas. | NA |
| acyl-CoA thioesterase 4 | ENSG00000177465 | ACOT4 | 122970 | NA | NA |
| paternally expressed 3 | ENSG00000198300 | PEG3 | 5178 | In human, ZIM2 and PEG3 are treated as two distinct genes though they share multiple 5’ exons and a common promoter and both genes are paternally expressed (PMID:15203203). Alternative splicing events connect their shared 5’ exons either with the remaining 4 exons unique to ZIM2, or with the remaining 2 exons unique to PEG3. In contrast, in other mammals ZIM2 does not undergo imprinting and, in mouse, cow, and likely other mammals as well, the ZIM2 and PEG3 genes do not share exons. Human PEG3 protein belongs to the Kruppel C2H2-type zinc finger protein family. PEG3 may play a role in cell proliferation and p53-mediated apoptosis. PEG3 has also shown tumor suppressor activity and tumorigenesis in glioma and ovarian cells. Alternative splicing of this PEG3 gene results in multiple transcript variants encoding distinct isoforms. | NA |
| NA | ENSG00000261759 | RP11-626G11.3 | ENSG00000261759 | NA | NA |
| hexokinase 3 | ENSG00000160883 | HK3 | 3101 | Hexokinases phosphorylate glucose to produce glucose-6-phosphate, the first step in most glucose metabolism pathways. This gene encodes hexokinase 3. Similar to hexokinases 1 and 2, this allosteric enzyme is inhibited by its product glucose-6-phosphate. | NA |
| zinc finger FYVE-type containing 28 | ENSG00000159733 | ZFYVE28 | 57732 | NA | NA |
| cytokine receptor like factor 1 | ENSG00000006016 | CRLF1 | 9244 | This gene encodes a member of the cytokine type I receptor family. The protein forms a secreted complex with cardiotrophin-like cytokine factor 1 and acts on cells expressing ciliary neurotrophic factor receptors. The complex can promote survival of neuronal cells. Mutations in this gene result in Crisponi syndrome and cold-induced sweating syndrome. | NA |
| testin LIM domain protein | ENSG00000135269 | TES | 26136 | Cancer-associated chromosomal changes often involve regions containing fragile sites. This gene maps to a common fragile site on chromosome 7q31.2 designated FRA7G. This gene is similar to mouse Testin, a testosterone-responsive gene encoding a Sertoli cell secretory protein containing three LIM domains. LIM domains are double zinc-finger motifs that mediate protein-protein interactions between transcription factors, cytoskeletal proteins and signaling proteins. This protein is a negative regulator of cell growth and may act as a tumor suppressor. This scaffold protein may also play a role in cell adhesion, cell spreading and in the reorganization of the actin cytoskeleton. Multiple protein isoforms are encoded by transcript variants of this gene. | NA |
| peptide YY, 2 (pseudogene) | ENSG00000237575 | PYY2 | 23615 | NA | NA |
| basic leucine zipper ATF-like transcription factor | ENSG00000156127 | BATF | 10538 | The protein encoded by this gene is a nuclear basic leucine zipper protein that belongs to the AP-1/ATF superfamily of transcription factors. The leucine zipper of this protein mediates dimerization with members of the Jun family of proteins. This protein is thought to be a negative regulator of AP-1/ATF transcriptional events. | NA |
| nephrocystin 1 | ENSG00000144061 | NPHP1 | 4867 | This gene encodes a protein with src homology domain 3 (SH3) patterns. This protein interacts with Crk-associated substrate, and it appears to function in the control of cell division, as well as in cell-cell and cell-matrix adhesion signaling, likely as part of a multifunctional complex localized in actin- and microtubule-based structures. Mutations in this gene cause familial juvenile nephronophthisis type 1, a kidney disorder involving both tubules and glomeruli. Defects in this gene are also associated with Senior-Loken syndrome type 1, also referred to as juvenile nephronophthisis with Leber amaurosis, which is characterized by kidney and eye disease, and with Joubert syndrome type 4, which is characterized by cerebellar ataxia, oculomotor apraxia, psychomotor delay and neonatal breathing abnormalities, sometimes including retinal dystrophy and renal disease. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| cysteine and glycine rich protein 3 | ENSG00000129170 | CSRP3 | 8048 | This gene encodes a member of the CSRP family of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this protein is found in a group of proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Mutations in this gene are thought to cause heritable forms of hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM) in humans. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. | NA |
| ADP ribosylation factor like GTPase 4D | ENSG00000175906 | ARL4D | 379 | ADP-ribosylation factor 4D is a member of the ADP-ribosylation factor family of GTP-binding proteins. ARL4D is closely similar to ARL4A and ARL4C and each has a nuclear localization signal and an unusually high guanine nucleotide exchange rate. This protein may play a role in membrane-associated intracellular trafficking. Mutations in this gene have been associated with Bardet-Biedl syndrome (BBS). | NA |
| NA | ENSG00000273018 | CTD-2303H24.2 | ENSG00000273018 | NA | NA |
| leucine rich repeats and immunoglobulin like domains 3 | ENSG00000139263 | LRIG3 | 121227 | NA | NA |
| low density lipoprotein receptor | ENSG00000130164 | LDLR | 3949 | The low density lipoprotein receptor (LDLR) gene family consists of cell surface proteins involved in receptor-mediated endocytosis of specific ligands. Low density lipoprotein (LDL) is normally bound at the cell membrane and taken into the cell ending up in lysosomes where the protein is degraded and the cholesterol is made available for repression of microsomal enzyme 3-hydroxy-3-methylglutaryl coenzyme A (HMG CoA) reductase, the rate-limiting step in cholesterol synthesis. At the same time, a reciprocal stimulation of cholesterol ester synthesis takes place. Mutations in this gene cause the autosomal dominant disorder, familial hypercholesterolemia. Alternate splicing results in multiple transcript variants. | NA |
| copine 5 | ENSG00000124772 | CPNE5 | 57699 | Calcium-dependent membrane-binding proteins may regulate molecular events at the interface of the cell membrane and cytoplasm. This gene is one of several genes that encode a calcium-dependent protein containing two N-terminal type II C2 domains and an integrin A domain-like sequence in the C-terminus. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene. More variants may exist, but their full-length natures could not be determined. | NA |
| macrophage scavenger receptor 1 | ENSG00000038945 | MSR1 | 4481 | This gene encodes the class A macrophage scavenger receptors, which include three different types (1, 2, 3) generated by alternative splicing of this gene. These receptors or isoforms are macrophage-specific trimeric integral membrane glycoproteins and have been implicated in many macrophage-associated physiological and pathological processes including atherosclerosis, Alzheimer’s disease, and host defense. The isoforms type 1 and type 2 are functional receptors and are able to mediate the endocytosis of modified low density lipoproteins (LDLs). The isoform type 3 does not internalize modified LDL (acetyl-LDL) despite having the domain shown to mediate this function in the types 1 and 2 isoforms. It has an altered intracellular processing and is trapped within the endoplasmic reticulum, making it unable to perform endocytosis. The isoform type 3 can inhibit the function of isoforms type 1 and type 2 when co-expressed, indicating a dominant negative effect and suggesting a mechanism for regulation of scavenger receptor activity in macrophages. | NA |
| NA | ENSG00000264924 | RP11-799B12.2 | ENSG00000264924 | NA | NA |
| complement component 4B (Chido blood group) | ENSG00000224389 | C4B | 721 | This gene encodes the basic form of complement factor 4, part of the classical activation pathway. The protein is expressed as a single chain precursor which is proteolytically cleaved into a trimer of alpha, beta, and gamma chains prior to secretion. The trimer provides a surface for interaction between the antigen-antibody complex and other complement components. The alpha chain may be cleaved to release C4 anaphylatoxin, a mediator of local inflammation. Deficiency of this protein is associated with systemic lupus erythematosus. This gene localizes to the major histocompatibility complex (MHC) class III region on chromosome 6. Varying haplotypes of this gene cluster exist, such that individuals may have 1, 2, or 3 copies of this gene. In addition, this gene exists as a long form and a short form due to the presence or absence of a 6.4 kb endogenous HERV-K retrovirus in intron 9. | NA |
| chromogranin A | ENSG00000100604 | CHGA | 1113 | The protein encoded by this gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins. It is found in secretory vesicles of neurons and endocrine cells. This gene product is a precursor to three biologically active peptides; vasostatin, pancreastatin, and parastatin. These peptides act as autocrine or paracrine negative modulators of the neuroendocrine system. Two other peptides, catestatin and chromofungin, have antimicrobial activity and antifungal activity, respectively. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| FK506 binding protein 5 | ENSG00000096060 | FKBP5 | 2289 | The protein encoded by this gene is a member of the immunophilin protein family, which play a role in immunoregulation and basic cellular processes involving protein folding and trafficking. This encoded protein is a cis-trans prolyl isomerase that binds to the immunosuppressants FK506 and rapamycin. It is thought to mediate calcineurin inhibition. It also interacts functionally with mature hetero-oligomeric progesterone receptor complexes along with the 90 kDa heat shock protein and P23 protein. This gene has been found to have multiple polyadenylation sites. Alternative splicing results in multiple transcript variants. | NA |
| cerebellin 3 precursor | ENSG00000139899 | CBLN3 | 643866 | Members of the precerebellin family, such as CBLN3, contain a cerebellin motif (see CBLN1; MIM 600432) and a C-terminal C1q signature domain (see MIM 120550) that mediates trimeric assembly of atypical collagen complexes. However, precerebellins do not contain a collagen motif, suggesting that they are not conventional components of the extracellular matrix (Pang et al., 2000 [PubMed 10964938]). | NA |
| adenosylmethionine decarboxylase 1 pseudogene 3 | ENSG00000249286 | AMD1P3 | ENSG00000249286 | NA | NA |
| epithelial membrane protein 1 | ENSG00000134531 | EMP1 | 2012 | NA | NA |
| Epstein-Barr virus induced 3 | ENSG00000105246 | EBI3 | 10148 | This gene was identified by its induced expression in B lymphocytes in response to Epstein-Barr virus infection. It encodes a secreted glycoprotein belonging to the hematopoietin receptor family, and heterodimerizes with a 28 kDa protein to form interleukin 27 (IL-27). IL-27 regulates T cell and inflammatory responses, in part by activating the Jak/STAT pathway of CD4+ T cells. | NA |
| 5’-aminolevulinate synthase 1 | ENSG00000023330 | ALAS1 | 211 | This gene encodes the mitochondrial enzyme which catalyzes the rate-limiting step in heme (iron-protoporphyrin) biosynthesis. The enzyme encoded by this gene is the housekeeping enzyme; a separate gene encodes a form of the enzyme that is specific for erythroid tissue. The level of the mature encoded protein is regulated by heme: high levels of heme down-regulate the mature enzyme in mitochondria while low heme levels up-regulate. A pseudogene of this gene is located on chromosome 12. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| NA | ENSG00000262905 | RP5-1029F21.2 | ENSG00000262905 | NA | NA |
| serine peptidase inhibitor, Kunitz type, 2 | ENSG00000167642 | SPINT2 | 10653 | This gene encodes a transmembrane protein with two extracellular Kunitz domains that inhibits a variety of serine proteases. The protein inhibits HGF activator which prevents the formation of active hepatocyte growth factor. This gene is a putative tumor suppressor, and mutations in this gene result in congenital sodium diarrhea. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| matrix Gla protein | ENSG00000111341 | MGP | 4256 | The protein encoded by this gene is secreted and likely acts as an inhibitor of bone formation. The encoded protein is found in the organic matrix of bone and cartilage. Defects in this gene are a cause of Keutel syndrome (KS). Two transcript variants encoding different isoforms have been found for this gene. | NA |
| myomesin 2 | ENSG00000036448 | MYOM2 | 9172 | The giant protein titin, together with its associated proteins, interconnects the major structure of sarcomeres, the M bands and Z discs. The C-terminal end of the titin string extends into the M line, where it binds tightly to M-band constituents of apparent molecular masses of 190 kD and 165 kD. The predicted MYOM2 protein contains 1,465 amino acids. Like MYOM1, MYOM2 has a unique N-terminal domain followed by 12 repeat domains with strong homology to either fibronectin type III or immunoglobulin C2 domains. Protein sequence comparisons suggested that the MYOM2 protein and bovine M protein are identical. | NA |
| keratin 2 | ENSG00000172867 | KRT2 | 3849 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| SEL1L family member 3 | ENSG00000091490 | SEL1L3 | 23231 | NA | NA |
| cysteine rich secretory protein LCCL domain containing 2 | ENSG00000103196 | CRISPLD2 | 83716 | NA | NA |
| myosin light chain, phosphorylatable, fast skeletal muscle | ENSG00000180209 | MYLPF | 29895 | NA | NA |
| NA | ENSG00000232450 | RP4-730K3.3 | ENSG00000232450 | NA | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",11,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
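The query-and-write step above is repeated for every cluster, and some Ensembl IDs come back unresolved (flagged `TRUE` in the `notfound` column of the tables). A minimal sketch of a helper that drops those IDs before writing is given below. The function name `write_cluster_genes` and the toy data frame are hypothetical and not part of the original analysis; the sketch assumes `notfound` is `TRUE` for unresolved IDs, `NA` otherwise, and absent when every query succeeds. Whether unresolved IDs should be dropped at all depends on how the downstream files are used.

```r
# Hypothetical helper (not from the original analysis): keep only the
# query IDs that were resolved, then write them one per line.
write_cluster_genes <- function(annot, path) {
  annot <- as.data.frame(annot)
  # Assumption: `notfound` is TRUE for unresolved IDs and NA otherwise;
  # the column is absent when every query succeeds.
  if ("notfound" %in% names(annot)) {
    annot <- annot[is.na(annot$notfound) | !annot$notfound, , drop = FALSE]
  }
  write.table(annot$query, file = path, col.names = FALSE,
              row.names = FALSE, quote = FALSE)
  invisible(annot$query)
}

# Toy input mimicking the shape of a queryMany() result.
toy <- data.frame(
  query    = c("ENSG00000111206", "ENSG00000034063", "ENSG00000237649"),
  symbol   = c("FOXM1", NA, "KIFC1"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
kept <- write_cluster_genes(toy, tempfile(fileext = ".txt"))
```

With a real result, the call would take the place of the `write.table` line, for example `write_cluster_genes(out, paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 11, ".txt"))`.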
out <- mygene::queryMany(gene_list[12,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | query | name | X_id | symbol | notfound |
|---|---|---|---|---|---|
| The protein encoded by this gene is a transcriptional activator involved in cell proliferation. The encoded protein is phosphorylated in M phase and regulates the expression of several cell cycle genes, such as cyclin B1 and cyclin D1. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000111206 | forkhead box M1 | 2305 | FOXM1 | NA |
| NA | ENSG00000249790 | NA | ENSG00000249790 | RP11-20D14.6 | NA |
| NA | ENSG00000237649 | kinesin family member C1 | 3833 | KIFC1 | NA |
| This gene encodes a member of the runt domain-containing family of transcription factors. A heterodimer of this protein and a beta subunit forms a complex that binds to the core DNA sequence 5’-PYGPYGGT-3’ found in a number of enhancers and promoters, and can either activate or suppress transcription. It also interacts with other transcription factors. It functions as a tumor suppressor, and the gene is frequently deleted or transcriptionally silenced in cancer. Alternative splicing results in multiple transcript variants. | ENSG00000020633 | runt related transcription factor 3 | 864 | RUNX3 | NA |
| The protein encoded by this gene is a glutathione-independent prostaglandin D synthase that catalyzes the conversion of prostaglandin H2 (PGH2) to prostaglandin D2 (PGD2). PGD2 functions as a neuromodulator as well as a trophic factor in the central nervous system. PGD2 is also involved in smooth muscle contraction/relaxation and is a potent inhibitor of platelet aggregation. This gene is preferentially expressed in brain. Studies with transgenic mice overexpressing this gene suggest that this gene may also be involved in the regulation of non-rapid eye movement sleep. | ENSG00000107317 | prostaglandin D2 synthase | 5730 | PTGDS | NA |
| NA | ENSG00000228477 | NA | ENSG00000228477 | RP3-342P20.2 | NA |
| This gene encodes an extracellular matrix protein with a spatially and temporally restricted tissue distribution. This protein is homohexameric with disulfide-linked subunits, and contains multiple EGF-like and fibronectin type-III domains. It is implicated in guidance of migrating neurons as well as axons during development, synaptic plasticity, and neuronal regeneration. | ENSG00000041982 | tenascin C | 3371 | TNC | NA |
| This gene encodes a member of the galactose-3-O-sulfotransferase protein family. The product of this gene catalyzes sulfonation by transferring a sulfate to the C-3’ position of galactose residues in O-linked glycoproteins. This enzyme is highly specific for core 1 structures, with asialofetuin, Gal-beta-1,3-GalNAc and Gal-beta-1,3 (GlcNAc-beta-1,6)GalNAc being good substrates. | ENSG00000197093 | galactose-3-O-sulfotransferase 4 | 79690 | GAL3ST4 | NA |
| This gene encodes a member of the carboxypeptidase A family of zinc metalloproteases. This enzyme is produced in the pancreas and preferentially cleaves C-terminal branched-chain and aromatic amino acids from dietary proteins. This gene and several family members are present in a gene cluster on chromosome 7. Mutations in this gene may be linked to chronic pancreatitis, while elevated protein levels may be associated with pancreatic cancer. | ENSG00000091704 | carboxypeptidase A1 | 1357 | CPA1 | NA |
| U3 RNA, an abundant small nucleolar RNA (snoRNA), is thought to play a role in the processing of ribosomal RNA precursors (Bernstein et al., 1983 [PubMed 6186397]). | ENSG00000263934 | small nucleolar RNA, C/D box 3A | 780851 | SNORD3A | NA |
| NA | ENSG00000233695 | GAS6 antisense RNA 1 | ENSG00000233695 | GAS6-AS1 | NA |
| The protein encoded by this gene belongs to a family of sarcomeric proteins that bind to calcineurin, a phosphatase involved in calcium-dependent signal transduction in diverse cell types. These family members tether calcineurin to alpha-actinin at the z-line of the sarcomere of cardiac and skeletal muscle cells, and thus they are important for calcineurin signaling. Mutations in this gene cause cardiomyopathy familial hypertrophic type 16, a hereditary heart disorder. | ENSG00000172399 | myozenin 2 | 51778 | MYOZ2 | NA |
| NA | ENSG00000034063 | NA | NA | NA | TRUE |
| NA | ENSG00000260686 | NA | ENSG00000260686 | CTB-36H16.2 | NA |
| This gene encodes an extracellular matrix protein, which belongs to the fibulin family. This protein binds various extracellular ligands and calcium. It may play a role during organ development, in particular, during the differentiation of heart, skeletal and neuronal structures. Alternatively spliced transcript variants encoding different isoforms have been identified. | ENSG00000163520 | fibulin 2 | 2199 | FBLN2 | NA |
| This gene encodes a member of the KDEL endoplasmic reticulum protein retention receptor family. Retention of resident soluble proteins in the lumen of the endoplasmic reticulum (ER) is achieved in both yeast and animal cells by their continual retrieval from the cis-Golgi, or a pre-Golgi compartment. Sorting of these proteins is dependent on a C-terminal tetrapeptide signal, usually lys-asp-glu-leu (KDEL) in animal cells, and his-asp-glu-leu (HDEL) in S. cerevisiae. This process is mediated by a receptor that recognizes, and binds the tetrapeptide-containing protein, and returns it to the ER. In yeast, the sorting receptor encoded by a single gene, ERD2, is a seven-transmembrane protein. Unlike yeast, several human homologs of the ERD2 gene, constituting the KDEL receptor gene family, have been described. KDELR3 was the third member of the family to be identified. Alternate splicing results in multiple transcript variants. | ENSG00000100196 | KDEL endoplasmic reticulum protein retention receptor 3 | 11015 | KDELR3 | NA |
| This gene is a member of the cytidine deaminase gene family. It is one of seven related genes or pseudogenes found in a cluster thought to result from gene duplication, on chromosome 22. Members of the cluster encode proteins that are structurally and functionally related to the C to U RNA-editing cytidine deaminase APOBEC1. It is thought that the proteins may be RNA editing enzymes and have roles in growth or cell cycle control. | ENSG00000244509 | apolipoprotein B mRNA editing enzyme catalytic subunit 3C | 27350 | APOBEC3C | NA |
| NA | ENSG00000213846 | NA | ENSG00000213846 | AC098614.2 | NA |
| RMI2 is a component of the BLM (RECQL3; MIM 604610) complex, which plays a role in homologous recombination-dependent DNA repair and is essential for genome stability (Xu et al., 2008 [PubMed 18923082]). | ENSG00000175643 | RecQ mediated genome instability 2 | 116028 | RMI2 | NA |
| NA | ENSG00000068489 | proline rich 11 | 55771 | PRR11 | NA |
| Thymidylate synthase catalyzes the methylation of deoxyuridylate to deoxythymidylate using 5,10-methylenetetrahydrofolate (methylene-THF) as a cofactor. This function maintains the dTMP (thymidine-5-prime monophosphate) pool critical for DNA replication and repair. The enzyme has been of interest as a target for cancer chemotherapeutic agents. It is considered to be the primary site of action for 5-fluorouracil, 5-fluoro-2-prime-deoxyuridine, and some folate analogs. Expression of this gene and that of a naturally occurring antisense transcript rTSalpha (GeneID:55556) vary inversely when cell-growth progresses from late-log to plateau phase. | ENSG00000176890 | thymidylate synthetase | 7298 | TYMS | NA |
| The protein encoded by this gene is involved in cell motility. It is expressed in breast tissue and together with other proteins, it forms a complex with BRCA1 and BRCA2, thus is potentially associated with higher risk of breast cancer. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | ENSG00000072571 | hyaluronan mediated motility receptor | 3161 | HMMR | NA |
| NA | ENSG00000251196 | NA | ENSG00000251196 | RP11-54F2.1 | NA |
| NA | ENSG00000175768 | translocase of outer mitochondrial membrane 5 | 401505 | TOMM5 | NA |
| This gene encodes a member of the semaphorin family of proteins. The encoded preproprotein is proteolytically processed to generate the mature glycosylphosphatidylinositol (GPI)-anchored membrane glycoprotein. The encoded protein is found on activated lymphocytes and erythrocytes and may be involved in immunomodulatory and neuronal processes. The encoded protein carries the John Milton Hagen (JMH) blood group antigens. Mutations in this gene may be associated with reduced bone mineral density (BMD). Alternative splicing results in multiple transcript variants, at least one of which encodes an isoform that is proteolytically processed. | ENSG00000138623 | semaphorin 7A (John Milton Hagen blood group) | 8482 | SEMA7A | NA |
| This gene encodes a member of the adaptor complexes small subunit family. The encoded protein is a subunit of the coatomer protein complex, a seven-subunit complex that functions in the formation of COPI-type, non-clathrin-coated vesicles. COPI vesicles function in the retrograde Golgi-to-ER transport of dilysine-tagged proteins. | ENSG00000005243 | coatomer protein complex subunit zeta 2 | 51226 | COPZ2 | NA |
| This gene encodes one of the three alpha chains of type VI collagen, a beaded filament collagen found in most connective tissues. The product of this gene contains several domains similar to von Willebrand Factor type A domains. These domains have been shown to bind extracellular matrix proteins, an interaction that explains the importance of this collagen in organizing matrix components. Mutations in this gene are associated with Bethlem myopathy and Ullrich scleroatonic muscular dystrophy. Three transcript variants have been identified for this gene. | ENSG00000142173 | collagen type VI alpha 2 | 1292 | COL6A2 | NA |
| This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This gene product may regulate the signaling activity of beta-catenin. Two alternately spliced transcripts encoding two protein isoforms have been identified. A processed pseudogene with high similarity to this locus has been mapped to chromosome 12p13. | ENSG00000057294 | plakophilin 2 | 5318 | PKP2 | NA |
| NA | ENSG00000185697 | MYB proto-oncogene like 1 | 4603 | MYBL1 | NA |
| NUSAP1 is a nucleolar-spindle-associated protein that plays a role in spindle microtubule organization (Raemaekers et al., 2003 [PubMed 12963707]). | ENSG00000137804 | nucleolar and spindle associated protein 1 | 51203 | NUSAP1 | NA |
| This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | ENSG00000204983 | protease, serine 1 | 5644 | PRSS1 | NA |
| NA | ENSG00000111665 | cell division cycle associated 3 | 83461 | CDCA3 | NA |
| Cyclin B2 is a member of the cyclin family, specifically the B-type cyclins. The B-type cyclins, B1 and B2, associate with p34cdc2 and are essential components of the cell cycle regulatory machinery. B1 and B2 differ in their subcellular localization. Cyclin B1 co-localizes with microtubules, whereas cyclin B2 is primarily associated with the Golgi region. Cyclin B2 also binds to transforming growth factor beta RII and thus cyclin B2/cdc2 may play a key role in transforming growth factor beta-mediated cell cycle control. | ENSG00000157456 | cyclin B2 | 9133 | CCNB2 | NA |
| The protein encoded by this gene belongs to the flavoprotein pyridine nucleotide cytochrome reductase family of proteins. Cytochrome b-type NAD(P)H oxidoreductases are implicated in many processes including cholesterol biosynthesis, fatty acid desaturation and elongation, and respiratory burst in neutrophils and macrophages. Cytochrome b5 reductases have soluble and membrane-bound forms that are the product of alternative splicing. In animal cells, the membrane-bound form binds to the endoplasmic reticulum, where it is a member of a fatty acid desaturation complex. Alternative splicing results in multiple transcript variants. | ENSG00000166394 | cytochrome b5 reductase 2 | 51700 | CYB5R2 | NA |
| Tryptases comprise a family of trypsin-like serine proteases, the peptidase family S1. Tryptases are enzymatically active only as heparin-stabilized tetramers, and they are resistant to all known endogenous proteinase inhibitors. Several tryptase genes are clustered on chromosome 16p13.3. These genes are characterized by several distinct features. They have a highly conserved 3’ UTR and contain tandem repeat sequences at the 5’ flank and 3’ UTR which are thought to play a role in regulation of the mRNA stability. These genes have an intron immediately upstream of the initiator Met codon, which separates the site of transcription initiation from protein coding sequence. This feature is characteristic of tryptases but is unusual in other genes. The alleles of this gene exhibit an unusual amount of sequence variation, such that the alleles were once thought to represent two separate genes, alpha and beta 1. Beta tryptases appear to be the main isoenzymes expressed in mast cells; whereas in basophils, alpha tryptases predominate. Tryptases have been implicated as mediators in the pathogenesis of asthma and other allergic and inflammatory disorders. | ENSG00000172236 | tryptase alpha/beta 1 | 7177 | TPSAB1 | NA |
| NA | ENSG00000150636 | coiled-coil domain containing 102B | 79839 | CCDC102B | NA |
| This gene is a member of the matrix metalloproteinase (MMP) gene family, that are zinc-dependent enzymes capable of cleaving components of the extracellular matrix and molecules involved in signal transduction. The protein encoded by this gene is a gelatinase A, type IV collagenase, that contains three fibronectin type II repeats in its catalytic site that allow binding of denatured type IV and V collagen and elastin. Unlike most MMP family members, activation of this protein can occur on the cell membrane. This enzyme can be activated extracellularly by proteases, or, intracellularly by its S-glutathiolation with no requirement for proteolytical removal of the pro-domain. This protein is thought to be involved in multiple pathways including roles in the nervous system, endometrial menstrual breakdown, regulation of vascularization, and metastasis. Mutations in this gene have been associated with Winchester syndrome and Nodulosis-Arthropathy-Osteolysis (NAO) syndrome. Alternative splicing results in multiple transcript variants encoding different isoforms. | ENSG00000087245 | matrix metallopeptidase 2 | 4313 | MMP2 | NA |
| NA | ENSG00000168876 | ankyrin repeat domain 49 | 54851 | ANKRD49 | NA |
| NA | ENSG00000224729 | PCOLCE antisense RNA 1 | 100129845 | PCOLCE-AS1 | NA |
| NA | ENSG00000272016 | NA | NA | NA | TRUE |
| This gene encodes a member of the mannose receptor family of proteins that contain a fibronectin type II domain and multiple C-type lectin-like domains. The encoded protein plays a role in extracellular matrix remodeling by mediating the internalization and lysosomal degradation of collagen ligands. Expression of this gene may play a role in the tumorigenesis and metastasis of several malignancies including breast cancer, gliomas and metastatic bone disease. | ENSG00000011028 | mannose receptor C type 2 | 9902 | MRC2 | NA |
| This gene is a member of the RUNX family of transcription factors and encodes a nuclear protein with a Runt DNA-binding domain. This protein is essential for osteoblastic differentiation and skeletal morphogenesis and acts as a scaffold for nucleic acids and regulatory factors involved in skeletal gene expression. The protein can bind DNA both as a monomer or, with more affinity, as a subunit of a heterodimeric complex. Mutations in this gene have been associated with the bone development disorder cleidocranial dysplasia (CCD). Transcript variants that encode different protein isoforms result from the use of alternate promoters as well as alternate splicing. | ENSG00000124813 | runt related transcription factor 2 | 860 | RUNX2 | NA |
| The protein encoded by this gene is a secreted, extracellular matrix protein containing an Arg-Gly-Asp (RGD) motif and calcium-binding EGF-like domains. It promotes adhesion of endothelial cells through interaction of integrins and the RGD motif. It is prominently expressed in developing arteries but less so in adult vessels. However, its expression is reinduced in balloon-injured vessels and atherosclerotic lesions, notably in intimal vascular smooth muscle cells and endothelial cells. Therefore, the protein encoded by this gene may play a role in vascular development and remodeling. Defects in this gene are a cause of autosomal dominant cutis laxa, autosomal recessive cutis laxa type I (CL type I), and age-related macular degeneration type 3 (ARMD3). | ENSG00000140092 | fibulin 5 | 10516 | FBLN5 | NA |
| Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | ENSG00000185615 | protein disulfide isomerase family A member 2 | 64714 | PDIA2 | NA |
| This gene is a member of a group of genes whose transcript levels are increased following stressful growth arrest conditions and treatment with DNA-damaging agents. The induction of this gene by ionizing radiation occurs in certain cell lines regardless of p53 status, and its protein response is correlated with apoptosis following ionizing radiation. | ENSG00000087074 | protein phosphatase 1 regulatory subunit 15A | 23645 | PPP1R15A | NA |
| NA | ENSG00000162878 | protein kinase domain containing, cytoplasmic | 91461 | PKDCC | NA |
| NA | ENSG00000249835 | VCAN antisense RNA 1 | ENSG00000249835 | VCAN-AS1 | NA |
| NA | ENSG00000095203 | erythrocyte membrane protein band 4.1 like 4B | 54566 | EPB41L4B | NA |
| NA | ENSG00000088325 | TPX2, microtubule nucleation factor | 22974 | TPX2 | NA |
| The leucine-rich repeat (LRR) family of proteins, including LRG1, have been shown to be involved in protein-protein interaction, signal transduction, and cell adhesion and development. LRG1 is expressed during granulocyte differentiation (O’Donnell et al., 2002 [PubMed 12223515]). | ENSG00000171236 | leucine rich alpha-2-glycoprotein 1 | 116844 | LRG1 | NA |
| This gene encodes a gamma-carboxyglutamic acid (Gla)-containing protein thought to be involved in the stimulation of cell proliferation. This gene is frequently overexpressed in many cancers and has been implicated as an adverse prognostic marker. Elevated protein levels are additionally associated with a variety of disease states, including venous thromboembolic disease, systemic lupus erythematosus, chronic renal failure, and preeclampsia. | ENSG00000183087 | growth arrest specific 6 | 2621 | GAS6 | NA |
| Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a muscle-specific, alpha actinin isoform that is expressed in both skeletal and cardiac muscles. Several transcript variants encoding different isoforms have been found for this gene. | ENSG00000077522 | actinin alpha 2 | 88 | ACTN2 | NA |
| NA | ENSG00000260296 | NA | ENSG00000260296 | RP11-395I6.3 | NA |
| This gene represents a member of the formin family of proteins. It is considered a diaphanous formin due to the presence of a diaphanous inhibitory domain located at the N-terminus of the encoded protein. Studies of a similar mouse protein indicate that the protein encoded by this locus may function in polymerization and depolymerization of actin filaments. Mutations at this locus have been associated with focal segmental glomerulosclerosis 5. | ENSG00000203485 | inverted formin, FH2 and WH2 domain containing | 64423 | INF2 | NA |
| NA | ENSG00000121690 | DEP domain containing 7 | 91614 | DEPDC7 | NA |
| The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | ENSG00000106537 | tetraspanin 13 | 27075 | TSPAN13 | NA |
| This gene encodes a member of the fibrillin family of proteins. The encoded preproprotein is proteolytically processed to generate two proteins including the extracellular matrix component fibrillin-1 and the protein hormone asprosin. Fibrillin-1 is an extracellular matrix glycoprotein that serves as a structural component of calcium-binding microfibrils. These microfibrils provide force-bearing structural support in elastic and nonelastic connective tissue throughout the body. Asprosin, secreted by white adipose tissue, has been shown to regulate glucose homeostasis. Mutations in this gene are associated with Marfan syndrome and the related MASS phenotype, as well as ectopia lentis syndrome, Weill-Marchesani syndrome, Shprintzen-Goldberg syndrome and neonatal progeroid syndrome. | ENSG00000166147 | fibrillin 1 | 2200 | FBN1 | NA |
| Sterile alpha motifs (SAMs) in proteins such as SAMD4A are part of an RNA-binding domain that functions as a posttranscriptional regulator by binding to an RNA sequence motif known as the Smaug recognition element, which was named after the Drosophila Smaug protein (Baez and Boccaccio, 2005 [PubMed 16221671]). | ENSG00000020577 | sterile alpha motif domain containing 4A | 23034 | SAMD4A | NA |
| NA | ENSG00000204219 | transcription elongation factor A3 | 6920 | TCEA3 | NA |
| Prostaglandin-endoperoxide synthase (PTGS), also known as cyclooxygenase, is the key enzyme in prostaglandin biosynthesis, and acts both as a dioxygenase and as a peroxidase. There are two isozymes of PTGS: a constitutive PTGS1 and an inducible PTGS2, which differ in their regulation of expression and tissue distribution. This gene encodes the inducible isozyme. It is regulated by specific stimulatory events, suggesting that it is responsible for the prostanoid biosynthesis involved in inflammation and mitogenesis. | ENSG00000073756 | prostaglandin-endoperoxide synthase 2 | 5743 | PTGS2 | NA |
| NA | ENSG00000255443 | NA | ENSG00000255443 | RP1-68D18.4 | NA |
| The protein encoded by this gene is a transcription factor with three tandem C2H2-type zinc fingers. Defects in this gene are associated with Charcot-Marie-Tooth disease type 1D (CMT1D), Charcot-Marie-Tooth disease type 4E (CMT4E), and with Dejerine-Sottas syndrome (DSS). Multiple transcript variants encoding two different isoforms have been found for this gene. | ENSG00000122877 | early growth response 2 | 1959 | EGR2 | NA |
| NA | ENSG00000168928 | chymotrypsinogen B2 | 440387 | CTRB2 | NA |
| This gene encodes the vitamin K-dependent coagulation factor X of the blood coagulation cascade. This factor undergoes multiple processing steps before its preproprotein is converted to a mature two-chain form by the excision of the tripeptide RKR. Two chains of the factor are held together by 1 or more disulfide bonds; the light chain contains 2 EGF-like domains, while the heavy chain contains the catalytic domain which is structurally homologous to those of the other hemostatic serine proteases. The mature factor is activated by the cleavage of the activation peptide by factor IXa (in the intrinsic pathway), or by factor VIIa (in the extrinsic pathway). The activated factor then converts prothrombin to thrombin in the presence of factor Va, Ca+2, and phospholipid during blood clotting. Mutations of this gene result in factor X deficiency, a hemorrhagic condition of variable severity. Alternative splicing results in multiple transcript variants encoding different isoforms that may undergo similar proteolytic processing to generate mature polypeptides. | ENSG00000126218 | coagulation factor X | 2159 | F10 | NA |
| NA | ENSG00000261542 | NA | ENSG00000261542 | RP11-16E18.3 | NA |
| NA | ENSG00000188707 | ZBED6 C-terminal like | 113763 | ZBED6CL | NA |
| The protein encoded by this gene is a cell cycle-regulated kinase that appears to be involved in microtubule formation and/or stabilization at the spindle pole during chromosome segregation. The encoded protein is found at the centrosome in interphase cells and at the spindle poles in mitosis. This gene may play a role in tumor development and progression. A processed pseudogene of this gene has been found on chromosome 1, and an unprocessed pseudogene has been found on chromosome 10. Multiple transcript variants encoding the same protein have been found for this gene. | ENSG00000087586 | aurora kinase A | 6790 | AURKA | NA |
| The Shaker gene family of Drosophila encodes components of voltage-gated potassium channels and is comprised of four subfamilies. Based on sequence similarity, this gene is similar to one of these subfamilies, namely the Shaw subfamily. The protein encoded by this gene belongs to the delayed rectifier class of channel proteins and is an integral membrane protein that mediates the voltage-dependent potassium ion permeability of excitable membranes. Alternate splicing results in several transcript variants. | ENSG00000131398 | potassium voltage-gated channel subfamily C member 3 | 3748 | KCNC3 | NA |
| APOLD1 is an endothelial cell early response protein that may play a role in regulation of endothelial cell signaling and vascular function (Regard et al., 2004 [PubMed 15102925]). | ENSG00000178878 | apolipoprotein L domain containing 1 | 81575 | APOLD1 | NA |
| The protein encoded by this gene is a secretory protein that contains a hyaluronan-binding domain, and thus is a member of the hyaluronan-binding protein family. The hyaluronan-binding domain is known to be involved in extracellular matrix stability and cell migration. This protein has been shown to form a stable complex with inter-alpha-inhibitor (I alpha I), and thus enhance the serine protease inhibitory activity of I alpha I, which is important in the protease network associated with inflammation. This gene can be induced by proinflammatory cytokines such as tumor necrosis factor alpha and interleukin-1. Enhanced levels of this protein are found in the synovial fluid of patients with osteoarthritis and rheumatoid arthritis. | ENSG00000123610 | TNF alpha induced protein 6 | 7130 | TNFAIP6 | NA |
| Myosin is a hexameric ATPase cellular motor protein. It is composed of two myosin heavy chains, two nonphosphorylatable myosin alkali light chains, and two phosphorylatable myosin regulatory light chains. This gene encodes a myosin alkali light chain that is found in embryonic muscle and adult atria. Two alternatively spliced transcript variants encoding the same protein have been found for this gene. | ENSG00000198336 | myosin light chain 4 | 4635 | MYL4 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | ENSG00000148795 | cytochrome P450 family 17 subfamily A member 1 | 1586 | CYP17A1 | NA |
| NA | ENSG00000175087 | PDLIM1 interacting kinase 1 like | 149420 | PDIK1L | NA |
| The protein encoded by this gene belongs to the calcium channel beta subunit family. It plays an important role in the calcium channel by modulating G protein inhibition, increasing peak calcium current, controlling the alpha-1 subunit membrane targeting and shifting the voltage dependence of activation and inactivation. Alternative splicing occurs at this locus and three transcript variants encoding three distinct isoforms have been identified. | ENSG00000067191 | calcium voltage-gated channel auxiliary subunit beta 1 | 782 | CACNB1 | NA |
| NA | ENSG00000246082 | nudix hydrolase 16 pseudogene 1 | 152195 | NUDT16P1 | NA |
| NA | ENSG00000168274 | NA | NA | NA | TRUE |
| This gene encodes a member of the fascin family of actin-binding proteins. Fascin proteins organize F-actin into parallel bundles, and are required for the formation of actin-based cellular protrusions. The encoded protein plays a critical role in cell migration, motility, adhesion and cellular interactions. Expression of this gene is known to be regulated by several microRNAs, and overexpression of this gene may play a role in the metastasis of multiple types of cancer by increasing cell motility. Expression of this gene is also a marker for Reed-Sternberg cells in Hodgkin’s lymphoma. A pseudogene of this gene is located on the long arm of chromosome 15. | ENSG00000075618 | fascin actin-bundling protein 1 | 6624 | FSCN1 | NA |
| NA | ENSG00000135362 | proline rich 5 like | 79899 | PRR5L | NA |
| Chondroadherin is a cartilage matrix protein thought to mediate adhesion of isolated chondrocytes. The protein contains 11 leucine-rich repeats flanked by cysteine-rich regions. The chondroadherin messenger RNA is present in chondrocytes at all ages. | ENSG00000136457 | chondroadherin | 1101 | CHAD | NA |
| NA | ENSG00000168389 | major facilitator superfamily domain containing 2A | 84879 | MFSD2A | NA |
| NA | ENSG00000247134 | NA | ENSG00000247134 | RP11-11N9.4 | NA |
| The inhibin beta A subunit joins the alpha subunit to form a pituitary FSH secretion inhibitor. Inhibin has been shown to regulate gonadal stromal cell proliferation negatively and to have tumor-suppressor activity. In addition, serum levels of inhibin have been shown to reflect the size of granulosa-cell tumors and can therefore be used as a marker for primary as well as recurrent disease. Because expression in gonadal and various extragonadal tissues may vary severalfold in a tissue-specific fashion, it is proposed that inhibin may be both a growth/differentiation factor and a hormone. Furthermore, the beta A subunit forms a homodimer, activin A, and also joins with a beta B subunit to form a heterodimer, activin AB, both of which stimulate FSH secretion. Finally, it has been shown that the beta A subunit mRNA is identical to the erythroid differentiation factor subunit mRNA and that only one gene for this mRNA exists in the human genome. | ENSG00000122641 | inhibin beta A subunit | 3624 | INHBA | NA |
| NA | ENSG00000124701 | apolipoprotein B mRNA editing enzyme catalytic subunit 2 | 10930 | APOBEC2 | NA |
| NA | ENSG00000155363 | Mov10 RISC complex RNA helicase | 4343 | MOV10 | NA |
| NA | ENSG00000182902 | solute carrier family 25 member 18 | 83733 | SLC25A18 | NA |
| This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. Expression is induced by phytohemagglutinin in human lymphocytes and by serum stimulation of arrested fibroblasts. The encoded protein acts as a nuclear transcription factor. Translocation of the protein from the nucleus to mitochondria induces apoptosis. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000123358 | nuclear receptor subfamily 4 group A member 1 | 3164 | NR4A1 | NA |
| This gene encodes an adenylate kinase enzyme involved in energy metabolism and homeostasis of cellular adenine nucleotide ratios in different intracellular compartments. This gene is highly expressed in skeletal muscle, brain and erythrocytes. Certain mutations in this gene resulting in a functionally inadequate enzyme are associated with a rare genetic disorder causing nonspherocytic hemolytic anemia. Alternative splicing of this gene results in multiple transcript variants encoding different isoforms. | ENSG00000106992 | adenylate kinase 1 | 203 | AK1 | NA |
| NA | ENSG00000258782 | NA | ENSG00000258782 | RP11-701B16.2 | NA |
| NA | ENSG00000235092 | ID2 antisense RNA 1 (head to head) | 100506299 | ID2-AS1 | NA |
| NA | ENSG00000142765 | synaptotagmin like 1 | 84958 | SYTL1 | NA |
| Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene consists of two polypeptide chains activated from a single precursor protein by proteolysis. The encoded protein is involved in the later stages of cell envelope formation in the epidermis and hair follicle. | ENSG00000125780 | transglutaminase 3 | 7053 | TGM3 | NA |
| NA | ENSG00000225075 | NA | ENSG00000225075 | RP11-426L16.3 | NA |
| This gene encodes a member of the dedicator of cytokinesis protein family. Members of this family are guanosine nucleotide exchange factors for Rho GTPases and defined by the presence of conserved DOCK-homology regions. The encoded protein belongs to the D (or Zizimin) subfamily of DOCK proteins, which also contain an N-terminal pleckstrin homology domain. Alternatively spliced transcript variants that encode different isoforms have been described. | ENSG00000135905 | dedicator of cytokinesis 10 | 55619 | DOCK10 | NA |
| This gene encodes a member of the serpin family of serine protease inhibitors. The protein is a major inhibitor of plasmin, which degrades fibrin and various other proteins. Consequently, the proper function of this gene has a major role in regulating the blood clotting pathway. Mutations in this gene result in alpha-2-plasmin inhibitor deficiency, which is characterized by severe hemorrhagic diathesis. Multiple transcript variants encoding different isoforms have been found for this gene. | ENSG00000167711 | serpin family F member 2 | 5345 | SERPINF2 | NA |
| The protein encoded by this gene is a glutathione-dependent prostaglandin E synthase. The expression of this gene has been shown to be induced by proinflammatory cytokine interleukin 1 beta (IL1B). Its expression can also be induced by tumor suppressor protein TP53, and may be involved in TP53 induced apoptosis. Knockout studies in mice suggest that this gene may contribute to the pathogenesis of collagen-induced arthritis and mediate acute pain during inflammatory responses. | ENSG00000148344 | prostaglandin E synthase | 9536 | PTGES | NA |
| Troponin proteins associate with tropomyosin and regulate the calcium sensitivity of the myofibril contractile apparatus of striated muscles. Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit, blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. The TnI-fast and TnI-slow genes are expressed in fast-twitch and slow-twitch skeletal muscle fibers, respectively, while the TnI-cardiac gene is expressed exclusively in cardiac muscle tissue. This gene encodes the Troponin-I-skeletal-slow-twitch protein. This gene is expressed in cardiac and skeletal muscle during early development but is restricted to slow-twitch skeletal muscle fibers in adults. The encoded protein prevents muscle contraction by inhibiting calcium-mediated conformational changes in actin-myosin complexes. | ENSG00000159173 | troponin I1, slow skeletal type | 7135 | TNNI1 | NA |
| NA | ENSG00000237773 | NA | ENSG00000237773 | AC003075.4 | NA |
| The protein encoded by this gene is a mitochondrial phosphate-activated glutaminase that catalyzes the hydrolysis of glutamine to stoichiometric amounts of glutamate and ammonia. Originally thought to be liver-specific, this protein has been found in other tissues as well. Alternative splicing results in multiple transcript variants that encode different isoforms. | ENSG00000135423 | glutaminase 2 | 27165 | GLS2 | NA |
| NA | ENSG00000259088 | NA | ENSG00000259088 | CTD-2017C7.2 | NA |
| NA | ENSG00000213149 | calponin 2 pseudogene 9 | ENSG00000213149 | CNN2P9 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 12, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
out <- mygene::queryMany(gene_list[13,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
kable(as.data.frame(out))
| symbol | X_id | name | query | summary |
|---|---|---|---|---|
| LOR | 4014 | loricrin | ENSG00000203782 | This gene encodes loricrin, a major protein component of the cornified cell envelope found in terminally differentiated epidermal cells. Mutations in this gene are associated with Vohwinkel’s syndrome and progressive symmetric erythrokeratoderma, both inherited skin diseases. |
| KRT2 | 3849 | keratin 2 | ENSG00000172867 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. |
| CDHR1 | 92211 | cadherin related family member 1 | ENSG00000148600 | This gene belongs to the cadherin superfamily of calcium-dependent cell adhesion molecules. The encoded protein is a photoreceptor-specific cadherin that plays a role in outer segment disc morphogenesis. Mutations in this gene are associated with inherited retinal dystrophies. Alternatively spliced transcript variants encoding different isoforms have been identified. |
| TNNT2 | 7139 | troponin T2, cardiac type | ENSG00000118194 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms; however, the full-length nature of some of these variants has not yet been determined. |
| DMKN | 93099 | dermokine | ENSG00000161249 | This gene is upregulated in inflammatory diseases, and it was first observed as expressed in the differentiated layers of skin. The most interesting aspect of this gene is the differential use of promoters and terminators to generate isoforms with unique cellular distributions and domain components. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. |
| S100A14 | 57402 | S100 calcium binding protein A14 | ENSG00000189334 | This gene encodes a member of the S100 protein family which contains an EF-hand motif and binds calcium. The gene is located in a cluster of S100 genes on chromosome 1. Levels of the encoded protein have been found to be lower in cancerous tissue and associated with metastasis suggesting a tumor suppressor function (PMID: 19956863, 19351828). |
| PKP3 | 11187 | plakophilin 3 | ENSG00000184363 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may act in cellular desmosome-dependent adhesion and signaling pathways. Two transcript variants encoding different isoforms have been found for this gene. |
| MUCL1 | 118430 | mucin like 1 | ENSG00000172551 | NA |
| SAA1 | 6288 | serum amyloid A1 | ENSG00000173432 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. |
| KRT1 | 3848 | keratin 1 | ENSG00000167768 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. |
| DCD | 117159 | dermcidin | ENSG00000161634 | This antimicrobial gene encodes a secreted protein that is subsequently processed into mature peptides of distinct biological activities. The C-terminal peptide is constitutively expressed in sweat and has antibacterial and antifungal activities. The N-terminal peptide, also known as diffusible survival evasion peptide, promotes neural cell survival under conditions of severe oxidative stress. A glycosylated form of the N-terminal peptide may be associated with cachexia (muscle wasting) in cancer patients. Alternative splicing results in multiple transcript variants encoding different isoforms. |
| NEBL | 10529 | nebulette | ENSG00000078114 | This gene encodes a nebulin like protein that is abundantly expressed in cardiac muscle. The encoded protein binds actin and interacts with thin filaments and Z-line associated proteins in striated muscle. This protein may be involved in cardiac myofibril assembly. A shorter isoform of this protein termed LIM nebulette is expressed in non-muscle cells and may function as a component of focal adhesion complexes. Alternate splicing results in multiple transcript variants. |
| MYCL | 4610 | v-myc avian myelocytomatosis viral oncogene lung carcinoma derived homolog | ENSG00000116990 | NA |
| PCK1 | 5105 | phosphoenolpyruvate carboxykinase 1 | ENSG00000124253 | This gene is a main control point for the regulation of gluconeogenesis. The cytosolic enzyme encoded by this gene, along with GTP, catalyzes the formation of phosphoenolpyruvate from oxaloacetate, with the release of carbon dioxide and GDP. The expression of this gene can be regulated by insulin, glucocorticoids, glucagon, cAMP, and diet. Defects in this gene are a cause of cytosolic phosphoenolpyruvate carboxykinase deficiency. A mitochondrial isozyme of the encoded protein also has been characterized. |
| CALML5 | 51806 | calmodulin like 5 | ENSG00000178372 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. |
| C3orf52 | 79669 | chromosome 3 open reading frame 52 | ENSG00000114529 | NA |
| THEM5 | 284486 | thioesterase superfamily member 5 | ENSG00000196407 | NA |
| KRTDAP | 388533 | keratinocyte differentiation associated protein | ENSG00000188508 | This gene encodes a protein which may function in the regulation of keratinocyte differentiation and maintenance of stratified epithelia. Multiple transcript variants encoding different isoforms have been found for this gene. |
| SAA2-SAA4 | 100528017 | SAA2-SAA4 readthrough | ENSG00000255071 | This locus represents naturally occurring read-through transcription between the neighboring serum amyloid A2 and serum amyloid A4 genes on chromosome 11. The read-through transcript produces a fusion protein that shares sequence identity with each individual gene product. |
| SAA2 | 6289 | serum amyloid A2 | ENSG00000134339 | NA |
| HHATL | 57467 | hedgehog acyltransferase-like | ENSG00000010282 | NA |
| RHOV | 171177 | ras homolog family member V | ENSG00000104140 | NA |
| CDH1 | 999 | cadherin 1 | ENSG00000039068 | This gene encodes a classical cadherin of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate the mature glycoprotein. This calcium-dependent cell-cell adhesion protein is comprised of five extracellular cadherin repeats, a transmembrane region and a highly conserved cytoplasmic tail. Mutations in this gene are correlated with gastric, breast, colorectal, thyroid and ovarian cancer. Loss of function of this gene is thought to contribute to cancer progression by increasing proliferation, invasion, and/or metastasis. The ectodomain of this protein mediates bacterial adhesion to mammalian cells and the cytoplasmic domain is required for internalization. This gene is present in a gene cluster with other members of the cadherin family on chromosome 16. |
| DSP | 1832 | desmoplakin | ENSG00000096696 | This gene encodes a protein that anchors intermediate filaments to desmosomal plaques and forms an obligate component of functional desmosomes. Mutations in this gene are the cause of several cardiomyopathies and keratodermas, including skin fragility-woolly hair syndrome. Alternative splicing results in multiple transcript variants. |
| RAP1GAP | 5909 | RAP1 GTPase activating protein | ENSG00000076864 | This gene encodes a type of GTPase-activating-protein (GAP) that down-regulates the activity of the ras-related RAP1 protein. RAP1 acts as a molecular switch by cycling between an inactive GDP-bound form and an active GTP-bound form. The product of this gene, RAP1GAP, promotes the hydrolysis of bound GTP and hence returns RAP1 to the inactive state whereas other proteins, guanine nucleotide exchange factors (GEFs), act as RAP1 activators by facilitating the conversion of RAP1 from the GDP- to the GTP-bound form. In general, ras subfamily proteins, such as RAP1, play key roles in receptor-linked signaling pathways that control cell growth and differentiation. RAP1 plays a role in diverse processes such as cell proliferation, adhesion, differentiation, and embryogenesis. Alternative splicing results in multiple transcript variants encoding distinct proteins. |
| TNNI3 | 7137 | troponin I3, cardiac type | ENSG00000129991 | Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit, blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. This gene encodes the TnI-cardiac protein and is exclusively expressed in cardiac muscle tissues. Mutations in this gene cause familial hypertrophic cardiomyopathy type 7 (CMH7) and familial restrictive cardiomyopathy (RCM). |
| PIP | 5304 | prolactin induced protein | ENSG00000159763 | NA |
| EPS8L1 | 54869 | EPS8 like 1 | ENSG00000131037 | This gene encodes a protein that is related to epidermal growth factor receptor pathway substrate 8 (EPS8), a substrate for the epidermal growth factor receptor. The function of this protein is unknown. At least two alternatively spliced transcript variants encoding different isoforms have been found for this gene. |
| ADH1B | 125 | alcohol dehydrogenase 1B (class I), beta polypeptide | ENSG00000196616 | The protein encoded by this gene is a member of the alcohol dehydrogenase family. Members of this enzyme family metabolize a wide variety of substrates, including ethanol, retinol, other aliphatic alcohols, hydroxysteroids, and lipid peroxidation products. This encoded protein, consisting of several homo- and heterodimers of alpha, beta, and gamma subunits, exhibits high activity for ethanol oxidation and plays a major role in ethanol catabolism. Three genes encoding alpha, beta and gamma subunits are tandemly organized in a genomic segment as a gene cluster. Two transcript variants encoding different isoforms have been found for this gene. |
| FNDC4 | 64838 | fibronectin type III domain containing 4 | ENSG00000115226 | NA |
| CTSV | 1515 | cathepsin V | ENSG00000136943 | The protein encoded by this gene, a member of the peptidase C1 family, is a lysosomal cysteine proteinase that may play an important role in corneal physiology. This gene is expressed in colorectal and breast carcinomas but not in normal colon, mammary gland, or peritumoral tissues, suggesting a possible role for this gene in tumor processes. Alternatively spliced variants, encoding the same protein, have been identified. |
| SGPP2 | 130367 | sphingosine-1-phosphate phosphatase 2 | ENSG00000163082 | The protein encoded by this gene is a transmembrane protein that degrades the bioactive signaling molecule sphingosine 1-phosphate. The encoded protein is induced during inflammatory responses and has been shown to be downregulated by the microRNA-31 tumor suppressor. Alternative splice variants encoding different isoforms have been found for this gene. |
| ANKRD1 | 27063 | ankyrin repeat domain 1 | ENSG00000148677 | The protein encoded by this gene is localized to the nucleus of endothelial cells and is induced by IL-1 and TNF-alpha stimulation. Studies in rat cardiomyocytes suggest that this gene functions as a transcription factor. Interactions between this protein and the sarcomeric proteins myopalladin and titin suggest that it may also be involved in the myofibrillar stretch-sensor system. |
| ENO1P1 | ENSG00000244457 | enolase 1, (alpha) pseudogene 1 | ENSG00000244457 | NA |
| KRT10 | 3858 | keratin 10 | ENSG00000186395 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. |
| RAB25 | 57111 | RAB25, member RAS oncogene family | ENSG00000132698 | The protein encoded by this gene is a member of the RAS superfamily of small GTPases. The encoded protein is involved in membrane trafficking and cell survival. This gene has been found to be a tumor suppressor and an oncogene, depending on the context. Two variants, one protein-coding and the other not, have been found for this gene. |
| RP11-517C16.2 | ENSG00000261286 | NA | ENSG00000261286 | NA |
| TNNC1 | 7134 | troponin C1, slow skeletal and cardiac type | ENSG00000114854 | Troponin is a central regulatory protein of striated muscle contraction, and together with tropomyosin, is located on the actin filament. Troponin consists of 3 subunits: TnI, which is the inhibitor of actomyosin ATPase; TnT, which contains the binding site for tropomyosin; and TnC, the protein encoded by this gene. The binding of calcium to TnC abolishes the inhibitory action of TnI, thus allowing the interaction of actin with myosin, the hydrolysis of ATP, and the generation of tension. Mutations in this gene are associated with cardiomyopathy dilated type 1Z. |
| KRT14 | 3861 | keratin 14 | ENSG00000186847 | This gene encodes a member of the keratin family, the most diverse group of intermediate filaments. This gene product, a type I keratin, is usually found as a heterotetramer with two keratin 5 molecules, a type II keratin. Together they form the cytoskeleton of epithelial cells. Mutations in the genes for these keratins are associated with epidermolysis bullosa simplex. At least one pseudogene has been identified at 17p12-p11. |
| PEBP4 | 157310 | phosphatidylethanolamine binding protein 4 | ENSG00000134020 | The phosphatidylethanolamine (PE)-binding proteins, including PEBP4, are an evolutionarily conserved family of proteins with pivotal biologic functions, such as lipid binding and inhibition of serine proteases (Wang et al., 2004 [PubMed 15302887]). |
| AOX1 | 316 | aldehyde oxidase 1 | ENSG00000138356 | Aldehyde oxidase produces hydrogen peroxide and, under certain conditions, can catalyze the formation of superoxide. Aldehyde oxidase is a candidate gene for amyotrophic lateral sclerosis. |
| B4GALNT3 | 283358 | beta-1,4-N-acetyl-galactosaminyltransferase 3 | ENSG00000139044 | B4GALNT3 transfers N-acetylgalactosamine (GalNAc) onto glucosyl residues to form N,N-prime-diacetyllactosediamine (LacdiNAc, or LDN), a unique terminal structure of cell surface N-glycans (Ikehara et al., 2006 [PubMed 16728562]). |
| TNNT1 | 7138 | troponin T1, slow skeletal type | ENSG00000105048 | This gene encodes a protein that is a subunit of troponin, which is a regulatory complex located on the thin filament of the sarcomere. This complex regulates striated muscle contraction in response to fluctuations in intracellular calcium concentration. This complex is composed of three subunits: troponin C, which binds calcium, troponin T, which binds tropomyosin, and troponin I, which is an inhibitory subunit. This protein is the slow skeletal troponin T subunit. Mutations in this gene cause nemaline myopathy type 5, also known as Amish nemaline myopathy, a neuromuscular disorder characterized by muscle weakness and rod-shaped, or nemaline, inclusions in skeletal muscle fibers which affects infants, resulting in death due to respiratory insufficiency, usually in the second year. Multiple transcript variants encoding different isoforms have been found for this gene. |
| SBSN | 374897 | suprabasin | ENSG00000189001 | NA |
| RP11-229P13.23 | ENSG00000231864 | NA | ENSG00000231864 | NA |
| PRSS8 | 5652 | protease, serine 8 | ENSG00000052344 | This gene encodes a member of the peptidase S1 or chymotrypsin family of serine proteases. The encoded preproprotein is proteolytically processed to generate light and heavy chains that associate via a disulfide bond to form the heterodimeric enzyme. This enzyme is highly expressed in prostate epithelia and is one of several proteolytic enzymes found in seminal fluid. This protease exhibits trypsin-like substrate specificity, cleaving protein substrates at the carboxyl terminus of lysine or arginine residues. The encoded protease partially mediates proteolytic activation of the epithelial sodium channel, a regulator of sodium balance, and may also play a role in epithelial barrier formation. |
| NRAP | 4892 | nebulin related anchoring protein | ENSG00000197893 | NA |
| CAMK2B | 816 | calcium/calmodulin dependent protein kinase II beta | ENSG00000058404 | The product of this gene belongs to the serine/threonine protein kinase family and to the Ca(2+)/calmodulin-dependent protein kinase subfamily. Calcium signaling is crucial for several aspects of plasticity at glutamatergic synapses. In mammalian cells, the enzyme is composed of four different chains: alpha, beta, gamma, and delta. The product of this gene is a beta chain. It is possible that distinct isoforms of this chain have different cellular localizations and interact differently with calmodulin. Alternative splicing results in multiple transcript variants. |
| CTD-2201G16.1 | ENSG00000258444 | NA | ENSG00000258444 | NA |
| SOX9 | 6662 | SRY-box 9 | ENSG00000125398 | The protein encoded by this gene recognizes the sequence CCTTGAG along with other members of the HMG-box class DNA-binding proteins. It acts during chondrocyte differentiation and, with steroidogenic factor 1, regulates transcription of the anti-Muellerian hormone (AMH) gene. Deficiencies lead to the skeletal malformation syndrome campomelic dysplasia, frequently with sex reversal. |
| FAM83H | 286077 | family with sequence similarity 83 member H | ENSG00000180921 | The protein encoded by this gene plays an important role in the structural development and calcification of tooth enamel. Defects in this gene are a cause of amelogenesis imperfecta type 3 (AI3). |
| SULT2B1 | 6820 | sulfotransferase family 2B member 1 | ENSG00000088002 | Sulfotransferase enzymes catalyze the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds. These cytosolic enzymes are different in their tissue distributions and substrate specificities. The gene structure (number and length of exons) is similar among family members. This gene sulfates dehydroepiandrosterone but not 4-nitrophenol, a typical substrate for the phenol and estrogen sulfotransferase subfamilies. Two alternatively spliced variants that encode different isoforms have been described. |
| PFKFB2 | 5208 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 2 | ENSG00000123836 | The protein encoded by this gene is involved in both the synthesis and degradation of fructose-2,6-bisphosphate, a regulatory molecule that controls glycolysis in eukaryotes. The encoded protein has a 6-phosphofructo-2-kinase activity that catalyzes the synthesis of fructose-2,6-bisphosphate, and a fructose-2,6-biphosphatase activity that catalyzes the degradation of fructose-2,6-bisphosphate. This protein regulates fructose-2,6-bisphosphate levels in the heart, while a related enzyme encoded by a different gene regulates fructose-2,6-bisphosphate levels in the liver and muscle. This enzyme functions as a homodimer. Two transcript variants encoding two different isoforms have been found for this gene. |
| SAPCD2 | 89958 | suppressor APC domain containing 2 | ENSG00000186193 | NA |
| LMO7 | 4008 | LIM domain 7 | ENSG00000136153 | This gene encodes a protein containing a calponin homology (CH) domain, a PDZ domain, and a LIM domain, and may be involved in protein-protein interactions. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene; however, the full-length nature of some variants is not known. |
| MYH7 | 4625 | myosin, heavy chain 7, cardiac muscle, beta | ENSG00000092054 | Muscle myosin is a hexameric protein containing 2 heavy chain subunits, 2 alkali light chain subunits, and 2 regulatory light chain subunits. This gene encodes the beta (or slow) heavy chain subunit of cardiac myosin. It is expressed predominantly in normal human ventricle. It is also expressed in skeletal muscle tissues rich in slow-twitch type I muscle fibers. Changes in the relative abundance of this protein and the alpha (or fast) heavy subunit of cardiac myosin correlate with the contractile velocity of cardiac muscle. Its expression is also altered during thyroid hormone depletion and hemodynamic overloading. Mutations in this gene are associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy, and Laing early-onset distal myopathy. |
| PRKCZ | 5590 | protein kinase C zeta | ENSG00000067606 | Protein kinase C (PKC) zeta is a member of the PKC family of serine/threonine kinases which are involved in a variety of cellular processes such as proliferation, differentiation and secretion. Unlike the classical PKC isoenzymes which are calcium-dependent, PKC zeta exhibits a kinase activity which is independent of calcium and diacylglycerol but not of phosphatidylserine. Furthermore, it is insensitive to typical PKC inhibitors and cannot be activated by phorbol ester. Unlike the classical PKC isoenzymes, it has only a single zinc finger module. These structural and biochemical properties indicate that the zeta subspecies is related to, but distinct from other isoenzymes of PKC. Alternative splicing results in multiple transcript variants encoding different isoforms. |
| MB | 4151 | myoglobin | ENSG00000198125 | This gene encodes a member of the globin superfamily and is expressed in skeletal and cardiac muscles. The encoded protein is a haemoprotein contributing to intracellular oxygen storage and transcellular facilitated diffusion of oxygen. At least three alternatively spliced transcript variants encoding the same protein have been reported. |
| FNBP1P1 | ENSG00000257800 | formin binding protein 1 pseudogene 1 | ENSG00000257800 | NA |
| FBXL16 | 146330 | F-box and leucine rich repeat protein 16 | ENSG00000127585 | Members of the F-box protein family, such as FBXL16, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). |
| SLC7A11 | 23657 | solute carrier family 7 member 11 | ENSG00000151012 | This gene encodes a member of a heteromeric, sodium-independent, anionic amino acid transport system that is highly specific for cysteine and glutamate. In this system, designated Xc(-), the anionic form of cysteine is transported in exchange for glutamate. This protein has been identified as the predominant mediator of Kaposi sarcoma-associated herpesvirus fusion and entry permissiveness into cells. Also, increased expression of this gene in primary gliomas (compared to normal brain tissue) was associated with increased glutamate secretion via the XCT channels, resulting in neuronal cell death. |
| LY6G6C | 80740 | lymphocyte antigen 6 complex, locus G6C | ENSG00000204421 | LY6G6C belongs to a cluster of leukocyte antigen-6 (LY6) genes located in the major histocompatibility complex (MHC) class III region on chromosome 6. Members of the LY6 superfamily typically contain 70 to 80 amino acids, including 8 to 10 cysteines. Most LY6 proteins are attached to the cell surface by a glycosylphosphatidylinositol (GPI) anchor that is directly involved in signal transduction (Mallya et al., 2002 [PubMed 12079290]). |
| ALDH1A3 | 220 | aldehyde dehydrogenase 1 family member A3 | ENSG00000184254 | This gene encodes an aldehyde dehydrogenase enzyme that uses retinal as a substrate. Mutations in this gene have been associated with microphthalmia, isolated 8, and expression changes have also been detected in tumor cells. Alternative splicing results in multiple transcript variants. |
| SLC2A1 | 6513 | solute carrier family 2 member 1 | ENSG00000117394 | This gene encodes a major glucose transporter in the mammalian blood-brain barrier. The encoded protein is found primarily in the cell membrane and on the cell surface, where it can also function as a receptor for human T-cell leukemia virus (HTLV) I and II. Mutations in this gene have been found in a family with paroxysmal exertion-induced dyskinesia. |
| PLA2G2A | 5320 | phospholipase A2 group IIA | ENSG00000188257 | The protein encoded by this gene is a member of the phospholipase A2 family (PLA2). PLA2s constitute a diverse family of enzymes with respect to sequence, function, localization, and divalent cation requirements. This gene product belongs to group II, which contains the secreted form of PLA2, an extracellular enzyme that has a low molecular mass and requires calcium ions for catalysis. It catalyzes the hydrolysis of the sn-2 fatty acid acyl ester bond of phosphoglycerides, releasing free fatty acids and lysophospholipids, and is thought to participate in the regulation of the phospholipid metabolism in biomembranes. Several alternatively spliced transcript variants with different 5’ UTRs have been found for this gene. |
| FAM198A | 729085 | family with sequence similarity 198 member A | ENSG00000144649 | NA |
| MT1A | 4489 | metallothionein 1A | ENSG00000205362 | NA |
| MYL2 | 4633 | myosin light chain 2 | ENSG00000111245 | This gene encodes the regulatory light chain associated with cardiac myosin beta (or slow) heavy chain. Ca+ triggers the phosphorylation of regulatory light chain that in turn triggers contraction. Mutations in this gene are associated with mid-left ventricular chamber type hypertrophic cardiomyopathy. |
| CLIC3 | 9022 | chloride intracellular channel 3 | ENSG00000169583 | Chloride channels are a diverse group of proteins that regulate fundamental cellular processes including stabilization of cell membrane potential, transepithelial transport, maintenance of intracellular pH, and regulation of cell volume. Chloride intracellular channel 3 is a member of the p64 family and is predominantly localized in the nucleus and stimulates chloride ion channel activity. In addition, this protein may participate in cellular growth control, based on its association with ERK7, a member of the MAP kinase family. |
| TNFRSF19 | 55504 | tumor necrosis factor receptor superfamily member 19 | ENSG00000127863 | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor is highly expressed during embryonic development. It has been shown to interact with TRAF family members, and to activate JNK signaling pathway when overexpressed in cells. This receptor is capable of inducing apoptosis by a caspase-independent mechanism, and it is thought to play an essential role in embryonic development. Alternatively spliced transcript variants encoding distinct isoforms have been described. |
| CDO1 | 1036 | cysteine dioxygenase type 1 | ENSG00000129596 | NA |
| SFN | 2810 | stratifin | ENSG00000175793 | NA |
| HSPB6 | 126393 | heat shock protein family B (small) member 6 | ENSG00000004776 | This locus encodes a heat shock protein. The encoded protein likely plays a role in smooth muscle relaxation. |
| RAB11FIP4 | 84440 | RAB11 family interacting protein 4 | ENSG00000131242 | Proteins of the large Rab GTPase family (see RAB1A; MIM 179508) have regulatory roles in the formation, targeting, and fusion of intracellular transport vesicles. RAB11FIP4 is one of many proteins that interact with and regulate Rab GTPases (Hales et al., 2001 [PubMed 11495908]). |
| VWA7 | 80737 | von Willebrand factor A domain containing 7 | ENSG00000204396 | NA |
| OSBPL3 | 26031 | oxysterol binding protein like 3 | ENSG00000070882 | This gene encodes a member of the oxysterol-binding protein (OSBP) family, a group of intracellular lipid receptors. Most members contain an N-terminal pleckstrin homology domain and a highly conserved C-terminal OSBP-like sterol-binding domain. The encoded protein is involved in the regulation of cell adhesion and organization of the actin cytoskeleton. Alternative splicing results in multiple transcript variants. |
| LGALS7B | 653499 | galectin 7B | ENSG00000178934 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. Differential and in situ hybridization studies indicate that this lectin is specifically expressed in keratinocytes and found mainly in stratified squamous epithelium. A duplicate copy of this gene (GeneID:3963) is found adjacent to, but on the opposite strand on chromosome 19. |
| AC002398.12 | ENSG00000267328 | NA | ENSG00000267328 | NA |
| PKDCC | 91461 | protein kinase domain containing, cytoplasmic | ENSG00000162878 | NA |
| GSTO2 | 119391 | glutathione S-transferase omega 2 | ENSG00000065621 | The protein encoded by this gene is an omega class glutathione S-transferase (GST). GSTs are involved in the metabolism of xenobiotics and carcinogens. Four transcript variants encoding different isoforms have been found for this gene. |
| AC132217.4 | ENSG00000240801 | NA | ENSG00000240801 | NA |
| TNS2 | 23371 | tensin 2 | ENSG00000111077 | The protein encoded by this gene belongs to the tensin family. Tensin is a focal adhesion molecule that binds to actin filaments and participates in signaling pathways. This protein plays a role in regulating cell migration. Alternative splicing occurs at this locus and three transcript variants encoding three distinct isoforms have been identified. |
| STON1 | 11037 | stonin 1 | ENSG00000243244 | Endocytosis of cell surface proteins is mediated by a complex molecular machinery that assembles on the inner surface of the plasma membrane. This gene encodes one of two human homologs of the Drosophila melanogaster stoned B protein. This protein is related to components of the endocytic machinery and exhibits a modular structure consisting of an N-terminal proline-rich domain, a central region of homology specific to the human stoned B-like proteins, and a C-terminal region homologous to the mu subunits of adaptor protein (AP) complexes. Read-through transcription of this gene into the neighboring downstream gene, which encodes TFIIA-alpha/beta-like factor, generates a transcript (SALF), which encodes a fusion protein comprised of sequence sharing identity with each individual gene product. Alternative splicing results in multiple transcript variants. |
| RARRES2 | 5919 | retinoic acid receptor responder 2 | ENSG00000106538 | This gene encodes a secreted chemotactic protein that initiates chemotaxis via the ChemR23 G protein-coupled seven-transmembrane domain ligand. Expression of this gene is upregulated by the synthetic retinoid tazarotene and occurs in a wide variety of tissues. The active protein has several roles, including that as an adipokine and as an antimicrobial protein with activity against bacteria and fungi. |
| INPP5J | 27124 | inositol polyphosphate-5-phosphatase J | ENSG00000185133 | NA |
| CTC-550B14.7 | ENSG00000267265 | NA | ENSG00000267265 | NA |
| EPCAM | 4072 | epithelial cell adhesion molecule | ENSG00000119888 | This gene encodes a carcinoma-associated antigen and is a member of a family that includes at least two type I membrane proteins. This antigen is expressed on most normal epithelial cells and gastrointestinal carcinomas and functions as a homotypic calcium-independent cell adhesion molecule. The antigen is being used as a target for immunotherapy treatment of human carcinomas. Mutations in this gene result in congenital tufting enteropathy. |
| AF127936.9 | ENSG00000235609 | NA | ENSG00000235609 | NA |
| AF127577.10 | ENSG00000229047 | NA | ENSG00000229047 | NA |
| ZG16B | 124220 | zymogen granule protein 16B | ENSG00000162078 | NA |
| STEAP1 | 26872 | six transmembrane epithelial antigen of the prostate 1 | ENSG00000164647 | This gene is predominantly expressed in prostate tissue, and is found to be upregulated in multiple cancer cell lines. The gene product is predicted to be a six-transmembrane protein, and was shown to be a cell surface antigen significantly expressed at cell-cell junctions. |
| PTH1R | 5745 | parathyroid hormone 1 receptor | ENSG00000160801 | The protein encoded by this gene is a member of the G-protein coupled receptor family 2. This protein is a receptor for parathyroid hormone (PTH) and for parathyroid hormone-like hormone (PTHLH). The activity of this receptor is mediated by G proteins which activate adenylyl cyclase and also a phosphatidylinositol-calcium second messenger system. Defects in this receptor are known to be the cause of Jansen’s metaphyseal chondrodysplasia (JMC), chondrodysplasia Blomstrand type (BOCD), as well as enchondromatosis. Two transcript variants encoding the same protein have been found for this gene. |
| MYH6 | 4624 | myosin, heavy chain 6, cardiac muscle, alpha | ENSG00000197616 | Cardiac muscle myosin is a hexamer consisting of two heavy chain subunits, two light chain subunits, and two regulatory subunits. This gene encodes the alpha heavy chain subunit of cardiac myosin. The gene is located 4kb downstream of the gene encoding the beta heavy chain subunit of cardiac myosin. Mutations in this gene cause familial hypertrophic cardiomyopathy and atrial septal defect 3. |
| MAPKAPK3 | 7867 | mitogen-activated protein kinase-activated protein kinase 3 | ENSG00000114738 | This gene encodes a member of the Ser/Thr protein kinase family. This kinase functions as a mitogen-activated protein kinase (MAP kinase)-activated protein kinase. MAP kinases, also known as extracellular signal-regulated kinases (ERKs), act as an integration point for multiple biochemical signals. This kinase was shown to be activated by growth inducers and stress stimulation of cells. In vitro studies demonstrated that ERK, p38 MAP kinase and Jun N-terminal kinase were all able to phosphorylate and activate this kinase, which suggested the role of this kinase as an integrative element of signaling in both mitogen and stress responses. This kinase was reported to interact with, phosphorylate and repress the activity of E47, which is a basic helix-loop-helix transcription factor known to be involved in the regulation of tissue-specific gene expression and cell differentiation. Alternate splicing results in multiple transcript variants that encode the same protein. |
| BMP1 | 649 | bone morphogenetic protein 1 | ENSG00000168487 | This gene encodes a protein that is capable of inducing formation of cartilage in vivo. Although other bone morphogenetic proteins are members of the TGF-beta superfamily, this gene encodes a protein that is not closely related to other known growth factors. This gene is expressed as alternatively spliced variants that share an N-terminal protease domain but differ in their C-terminal region. |
| CSRP3 | 8048 | cysteine and glycine rich protein 3 | ENSG00000129170 | This gene encodes a member of the CSRP family of LIM domain proteins, which may be involved in regulatory processes important for development and cellular differentiation. The LIM/double zinc-finger motif found in this protein is found in a group of proteins with critical functions in gene regulation, cell growth, and somatic differentiation. Mutations in this gene are thought to cause heritable forms of hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM) in humans. Alternatively spliced transcript variants with different 5’ UTR, but encoding the same protein, have been found for this gene. |
| RP4-564F22.5 | ENSG00000224635 | NA | ENSG00000224635 | NA |
| ANGPTL8 | 55908 | angiopoietin like 8 | ENSG00000130173 | NA |
| TG | 7038 | thyroglobulin | ENSG00000042832 | Thyroglobulin (Tg) is a glycoprotein homodimer produced predominantly by the thyroid gland. It acts as a substrate for the synthesis of thyroxine and triiodothyronine as well as the storage of the inactive forms of thyroid hormone and iodine. Thyroglobulin is secreted from the endoplasmic reticulum to its site of iodination, and subsequent thyroxine biosynthesis, in the follicular lumen. Mutations in this gene cause thyroid dyshormonogenesis, manifested as goiter, and are associated with moderate to severe congenital hypothyroidism. Polymorphisms in this gene are associated with susceptibility to autoimmune thyroid diseases (AITD) such as Graves disease and Hashimoto thyroiditis. |
| ACTA1 | 58 | actin, alpha 1, skeletal muscle | ENSG00000143632 | The product encoded by this gene belongs to the actin family of proteins, which are highly conserved proteins that play a role in cell motility, structure and integrity. Alpha, beta and gamma actin isoforms have been identified, with alpha actins being a major constituent of the contractile apparatus, while beta and gamma actins are involved in the regulation of cell motility. This actin is an alpha actin that is found in skeletal muscle. Mutations in this gene cause nemaline myopathy type 3, congenital myopathy with excess of thin myofilaments, congenital myopathy with cores, and congenital myopathy with fiber-type disproportion, diseases that lead to muscle fiber defects. |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",13,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
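The write step above is repeated for every cluster, and the annotation table for some clusters carries a `notfound` column marking Ensembl IDs that had no match. A minimal sketch of how the step could be wrapped in one function is given below. The helper name `write_cluster_ids` and its arguments are hypothetical and not part of the original analysis. Dropping unmatched IDs is an assumption made here for illustration (the original call writes every queried ID), and the sketch also assumes `notfound` is a logical column.

```r
# Hypothetical helper (not part of the original analysis): write the queried
# Ensembl IDs for one cluster to gene_names_clus_<cluster>.txt in out_dir.
write_cluster_ids <- function(annot, cluster, out_dir) {
  annot <- as.data.frame(annot)
  # Assumption: `notfound` exists only when at least one ID had no match and
  # holds TRUE for unmatched IDs, NA otherwise. Unmatched IDs are dropped here.
  if ("notfound" %in% names(annot)) {
    keep <- is.na(annot$notfound) | !annot$notfound
    annot <- annot[keep, , drop = FALSE]
  }
  out_file <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  write.table(annot$query, out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(out_file)
}

# Toy input shaped like the annotation table; no network access is needed.
toy <- data.frame(query    = c("ENSG00000164761", "ENSG00000117289"),
                  symbol   = c("TNFRSF11B", NA),
                  notfound = c(NA, TRUE),
                  stringsAsFactors = FALSE)
toy_file <- write_cluster_ids(toy, 14, tempdir())
readLines(toy_file)
```

With the real objects, the call would look like `write_cluster_ids(out, 13, "../utilities/GTEX2013_sparse_fac_voom")`. To keep the original behaviour of writing every queried ID, the `notfound` filter can simply be removed.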
out <- mygene::queryMany(gene_list[14,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | query | name | summary | notfound |
|---|---|---|---|---|---|
| TNFRSF11B | 4982 | ENSG00000164761 | tumor necrosis factor receptor superfamily member 11b | The protein encoded by this gene is a member of the TNF-receptor superfamily. This protein is an osteoblast-secreted decoy receptor that functions as a negative regulator of bone resorption. This protein specifically binds to its ligand, osteoprotegerin ligand, both of which are key extracellular regulators of osteoclast development. Studies of the mouse counterpart also suggest that this protein and its ligand play a role in lymph-node organogenesis and vascular calcification. Alternatively spliced transcript variants of this gene have been reported, but their full length nature has not been determined. | NA |
| NA | NA | ENSG00000117289 | NA | NA | TRUE |
| TNFSF10 | 8743 | ENSG00000121858 | tumor necrosis factor superfamily member 10 | The protein encoded by this gene is a cytokine that belongs to the tumor necrosis factor (TNF) ligand family. This protein preferentially induces apoptosis in transformed and tumor cells, but does not appear to kill normal cells although it is expressed at a significant level in most normal tissues. This protein binds to several members of TNF receptor superfamily including TNFRSF10A/TRAILR1, TNFRSF10B/TRAILR2, TNFRSF10C/TRAILR3, TNFRSF10D/TRAILR4, and possibly also to TNFRSF11B/OPG. The activity of this protein may be modulated by binding to the decoy receptors TNFRSF10C/TRAILR3, TNFRSF10D/TRAILR4, and TNFRSF11B/OPG that cannot induce apoptosis. The binding of this protein to its receptors has been shown to trigger the activation of MAPK8/JNK, caspase 8, and caspase 3. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| SLC2A4 | 6517 | ENSG00000181856 | solute carrier family 2 member 4 | This gene is a member of the solute carrier family 2 (facilitated glucose transporter) family and encodes a protein that functions as an insulin-regulated facilitative glucose transporter. In the absence of insulin, this integral membrane protein is sequestered within the cells of muscle and adipose tissue. Within minutes of insulin stimulation, the protein moves to the cell surface and begins to transport glucose across the cell membrane. Mutations in this gene have been associated with noninsulin-dependent diabetes mellitus (NIDDM). | NA |
| HSPA7 | ENSG00000225217 | ENSG00000225217 | heat shock protein family A (Hsp70) member 7 | NA | NA |
| GPR176 | 11245 | ENSG00000166073 | G protein-coupled receptor 176 | Members of the G protein-coupled receptor family, such as GPR176, are cell surface receptors involved in responses to hormones, growth factors, and neurotransmitters (Hata et al., 1995 [PubMed 7893747]). | NA |
| SNCA | 6622 | ENSG00000145335 | synuclein alpha | Alpha-synuclein is a member of the synuclein family, which also includes beta- and gamma-synuclein. Synucleins are abundantly expressed in the brain and alpha- and beta-synuclein inhibit phospholipase D2 selectively. SNCA may serve to integrate presynaptic signaling and membrane trafficking. Defects in SNCA have been implicated in the pathogenesis of Parkinson disease. SNCA peptides are a major component of amyloid plaques in the brains of patients with Alzheimer’s disease. Four alternatively spliced transcripts encoding two different isoforms have been identified for this gene. | NA |
| P2RY1 | 5028 | ENSG00000169860 | purinergic receptor P2Y1 | The product of this gene belongs to the family of G-protein coupled receptors. This family has several receptor subtypes with different pharmacological selectivity, which overlaps in some cases, for various adenosine and uridine nucleotides. This receptor functions as a receptor for extracellular ATP and ADP. In platelets binding to ADP leads to mobilization of intracellular calcium ions via activation of phospholipase C, a change in platelet shape, and probably to platelet aggregation. | NA |
| ST6GALNAC2 | 10610 | ENSG00000070731 | ST6 N-acetylgalactosaminide alpha-2,6-sialyltransferase 2 | ST6GALNAC2 belongs to a family of sialyltransferases that add sialic acids to the nonreducing ends of glycoconjugates. At the cell surface, these modifications have roles in cell-cell and cell-substrate interactions, bacterial adhesion, and protein targeting (Samyn-Petit et al., 2000 [PubMed 10742600]). | NA |
| CST6 | 1474 | ENSG00000175315 | cystatin E/M | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins and the kininogens. The type 2 cystatin proteins are a class of cysteine proteinase inhibitors found in a variety of human fluids and secretions, where they appear to provide protective functions. This gene encodes a cystatin from the type 2 family, which is down-regulated in metastatic breast tumor cells as compared to primary tumor cells. Loss of expression is likely associated with the progression of a primary tumor to a metastatic phenotype. | NA |
| MTHFD1L | 25902 | ENSG00000120254 | methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 1-like | The protein encoded by this gene is involved in the synthesis of tetrahydrofolate (THF) in the mitochondrion. THF is important in the de novo synthesis of purines and thymidylate and in the regeneration of methionine from homocysteine. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| SERPINE1 | 5054 | ENSG00000106366 | serpin family E member 1 | This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| RP11-389C8.2 | ENSG00000261269 | ENSG00000261269 | NA | NA | NA |
| ZNF215 | 7762 | ENSG00000149054 | zinc finger protein 215 | NA | NA |
| H19 | 283120 | ENSG00000130600 | H19, imprinted maternally expressed transcript (non-protein coding) | This gene is located in an imprinted region of chromosome 11 near the insulin-like growth factor 2 (IGF2) gene. This gene is only expressed from the maternally-inherited chromosome, whereas IGF2 is only expressed from the paternally-inherited chromosome. The product of this gene is a long non-coding RNA which functions as a tumor suppressor. Mutations in this gene have been associated with Beckwith-Wiedemann Syndrome and Wilms tumorigenesis. Alternative splicing results in multiple transcript variants. | NA |
| PLIN5 | 440503 | ENSG00000214456 | perilipin 5 | Members of the perilipin family, such as PLIN5, coat intracellular lipid storage droplets and protect them from lipolytic degradation (Dalen et al., 2007 [PubMed 17234449]). | NA |
| ADRB2 | 154 | ENSG00000169252 | adrenoceptor beta 2 | This gene encodes beta-2-adrenergic receptor which is a member of the G protein-coupled receptor superfamily. This receptor is directly associated with one of its ultimate effectors, the class C L-type calcium channel Ca(V)1.2. This receptor-channel complex also contains a G protein, an adenylyl cyclase, cAMP-dependent kinase, and the counterbalancing phosphatase, PP2A. The assembly of the signaling complex provides a mechanism that ensures specific and rapid signaling by this G protein-coupled receptor. This gene is intronless. Different polymorphic forms, point mutations, and/or downregulation of this gene are associated with nocturnal asthma, obesity and type 2 diabetes. | NA |
| RBP7 | 116362 | ENSG00000162444 | retinol binding protein 7 | Due to its chemical instability and low solubility in aqueous solution, vitamin A requires cellular retinol-binding proteins (CRBPs), such as RBP7, for stability, internalization, intercellular transfer, homeostasis, and metabolism. | NA |
| CARD14 | 79092 | ENSG00000141527 | caspase recruitment domain family member 14 | This gene encodes a caspase recruitment domain-containing protein that is a member of the membrane-associated guanylate kinase (MAGUK) family of proteins. Members of this protein family are scaffold proteins that are involved in a diverse array of cellular processes including cellular adhesion, signal transduction and cell polarity control. This protein has been shown to specifically interact with BCL10, a protein known to function as a positive regulator of cell apoptosis and NF-kappaB activation. Alternate splicing results in multiple transcript variants. | NA |
| HCP5 | 10866 | ENSG00000206337 | HLA complex P5 (non-protein coding) | NA | NA |
| SYNGR1 | 9145 | ENSG00000100321 | synaptogyrin 1 | This gene encodes an integral membrane protein associated with presynaptic vesicles in neuronal cells. The exact function of this protein is unclear, but studies of a similar murine protein suggest that it functions in synaptic plasticity without being required for synaptic transmission. The gene product belongs to the synaptogyrin gene family. Three alternatively spliced variants encoding three different isoforms have been identified. | NA |
| NDRG2 | 57447 | ENSG00000165795 | NDRG family member 2 | This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that may play a role in neurite outgrowth. This gene may be involved in glioblastoma carcinogenesis. Several alternatively spliced transcript variants of this gene have been described, but the full-length nature of some of these variants has not been determined. | NA |
| ITGA10 | 8515 | ENSG00000143127 | integrin subunit alpha 10 | Integrins are integral transmembrane glycoproteins composed of noncovalently linked alpha and beta chains. They participate in cell adhesion as well as cell-surface mediated signalling. This gene encodes an integrin alpha chain and is expressed at high levels in chondrocytes, where it is transcriptionally regulated by AP-2epsilon and Ets-1. The protein encoded by this gene binds to collagen. Alternative splicing results in multiple transcript variants. | NA |
| GIMAP5 | 55340 | ENSG00000196329 | GTPase, IMAP family member 5 | This gene encodes a protein belonging to the GTP-binding superfamily and to the immuno-associated nucleotide (IAN) subfamily of nucleotide-binding proteins. In humans, the IAN subfamily genes are located in a cluster at 7q36.1. This gene encodes an antiapoptotic protein that functions in T-cell survival. Polymorphisms in this gene are associated with systemic lupus erythematosus. Read-through transcription exists between this gene and the neighboring upstream GIMAP1 (GTPase, IMAP family member 1) gene. | NA |
| CTGF | 1490 | ENSG00000118523 | connective tissue growth factor | The protein encoded by this gene is a mitogen that is secreted by vascular endothelial cells. The encoded protein plays a role in chondrocyte proliferation and differentiation, cell adhesion in many cell types, and is related to platelet-derived growth factor. Certain polymorphisms in this gene have been linked with a higher incidence of systemic sclerosis. | NA |
| NA | NA | ENSG00000241732 | NA | NA | TRUE |
| ARHGAP25 | 9938 | ENSG00000163219 | Rho GTPase activating protein 25 | ARHGAPs, such as ARHGAP25, encode negative regulators of Rho GTPases (see ARHA; MIM 165390), which are implicated in actin remodeling, cell polarity, and cell migration (Katoh and Katoh, 2004 [PubMed 15254788]). | NA |
| ARHGEF4 | 50649 | ENSG00000136002 | Rho guanine nucleotide exchange factor 4 | Rho GTPases play a fundamental role in numerous cellular processes that are initiated by extracellular stimuli that work through G protein coupled receptors. The protein encoded by this gene may form a complex with G proteins and stimulate Rho-dependent signals. Multiple alternatively spliced transcript variants encoding different isoforms have been found, but the full-length nature of some variants has not been determined. | NA |
| RP11-315I20.3 | ENSG00000244619 | ENSG00000244619 | NA | NA | NA |
| TRIM63 | 84676 | ENSG00000158022 | tripartite motif containing 63 | This gene encodes a member of the RING zinc finger protein family found in striated muscle and iris. The product of this gene is an E3 ubiquitin ligase that localizes to the Z-line and M-line lattices of myofibrils. This protein plays an important role in the atrophy of skeletal and cardiac muscle and is required for the degradation of myosin heavy chain proteins, myosin light chain, myosin binding protein, and for muscle-type creatine kinase. | NA |
| ABCG1 | 9619 | ENSG00000160179 | ATP binding cassette subfamily G member 1 | The protein encoded by this gene is a member of the superfamily of ATP-binding cassette (ABC) transporters. ABC proteins transport various molecules across extra- and intra-cellular membranes. ABC genes are divided into seven distinct subfamilies (ABC1, MDR/TAP, MRP, ALD, OABP, GCN20, White). This protein is a member of the White subfamily. It is involved in macrophage cholesterol and phospholipids transport, and may regulate cellular lipid homeostasis in other cell types. Six alternative splice variants have been identified. | NA |
| SPRR2E | 6704 | ENSG00000203785 | small proline rich protein 2E | This gene encodes a member of a family of small proline-rich proteins clustered in the epidermal differentiation complex on chromosome 1q21. The encoded protein, along with other family members, is a component of the cornified cell envelope that forms beneath the plasma membrane in terminally differentiated stratified squamous epithelia. This envelope serves as a barrier against extracellular and environmental factors. The seven SPRR2 genes (A-G) appear to have been homogenized by gene conversion compared to others in the cluster that exhibit greater differences in protein structure. | NA |
| TPH1 | 7166 | ENSG00000129167 | tryptophan hydroxylase 1 | This gene encodes a member of the aromatic amino acid hydroxylase family. The encoded protein catalyzes the first and rate limiting step in the biosynthesis of serotonin, an important hormone and neurotransmitter. Mutations in this gene have been associated with an elevated risk for a variety of diseases and disorders, including schizophrenia, somatic anxiety, anger-related traits, bipolar disorder, suicidal behavior, addictions, and others. | NA |
| RAI14 | 26064 | ENSG00000039560 | retinoic acid induced 14 | NA | NA |
| RHCG | 51458 | ENSG00000140519 | Rh family C glycoprotein | NA | NA |
| BCAR1 | 9564 | ENSG00000050820 | BCAR1, Cas family scaffolding protein | BCAR1, or CAS, is an Src (MIM 190090) family kinase substrate involved in various cellular events, including migration, survival, transformation, and invasion (Sawada et al., 2006 [PubMed 17129785]). | NA |
| NMNAT3 | 349565 | ENSG00000163864 | nicotinamide nucleotide adenylyltransferase 3 | This gene encodes a member of the nicotinamide/nicotinic acid mononucleotide adenylyltransferase family. These enzymes use ATP to catalyze the synthesis of nicotinamide adenine dinucleotide or nicotinic acid adenine dinucleotide from nicotinamide mononucleotide or nicotinic acid mononucleotide, respectively. The encoded protein is localized to mitochondria and may also play a neuroprotective role as a molecular chaperone. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| CELSR3 | 1951 | ENSG00000008300 | cadherin EGF LAG seven-pass G-type receptor 3 | This gene belongs to the flamingo subfamily, which is included in the cadherin superfamily. The flamingo cadherins consist of nonclassic-type cadherins that do not interact with catenins. They are plasma membrane proteins containing seven epidermal growth factor-like repeats, nine cadherin domains and two laminin A G-type repeats in their ectodomain. They also have seven transmembrane domains, a characteristic feature of their subfamily. The encoded protein may be involved in the regulation of contact-dependent neurite growth and may play a role in tumor formation. | NA |
| ABLIM1 | 3983 | ENSG00000099204 | actin binding LIM protein 1 | This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| SOX15 | 6665 | ENSG00000129194 | SRY-box 15 | This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins. | NA |
| SIPA1L2 | 57568 | ENSG00000116991 | signal induced proliferation associated 1 like 2 | This gene encodes a member of the signal-induced proliferation-associated 1 like family. Members of this family contain a GTPase activating domain, a PDZ domain and a C-terminal coiled-coil domain with a leucine zipper. A similar protein in rat acts as a GTPase-activating protein for the small GTPase Rap. | NA |
| MYH10 | 4628 | ENSG00000133026 | myosin, heavy chain 10, non-muscle | This gene encodes a member of the myosin superfamily. The protein represents a conventional non-muscle myosin; it should not be confused with the unconventional myosin-10 (MYO10). Myosins are actin-dependent motor proteins with diverse functions including regulation of cytokinesis, cell motility, and cell polarity. Mutations in this gene have been associated with May-Hegglin anomaly and developmental defects in brain and heart. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| VASN | 114990 | ENSG00000168140 | vasorin | NA | NA |
| FXYD6 | 53826 | ENSG00000137726 | FXYD domain containing ion transport regulator 6 | This gene encodes a member of the FXYD family of transmembrane proteins. This particular protein encodes phosphohippolin, which likely affects the activity of Na,K-ATPase. Multiple alternatively spliced transcript variants encoding the same protein have been described. Related pseudogenes have been identified on chromosomes 10 and X. Read-through transcripts have been observed between this locus and the downstream sodium/potassium-transporting ATPase subunit gamma (FXYD2, GeneID 486) locus. | NA |
| AC084809.2 | ENSG00000226377 | ENSG00000226377 | NA | NA | NA |
| CNFN | 84518 | ENSG00000105427 | cornifelin | NA | NA |
| LRRN2 | 10446 | ENSG00000170382 | leucine rich repeat neuronal 2 | The protein encoded by this gene belongs to the leucine-rich repeat superfamily. This gene was found to be amplified and overexpressed in malignant gliomas. The encoded protein has homology with other proteins that function as cell-adhesion molecules or as signal transduction receptors and is a candidate for the target gene in the 1q32.1 amplicon in malignant gliomas. Two alternatively spliced transcript variants encoding the same protein have been described for this gene. | NA |
| CEACAM1 | 634 | ENSG00000079385 | carcinoembryonic antigen related cell adhesion molecule 1 | This gene encodes a member of the carcinoembryonic antigen (CEA) gene family, which belongs to the immunoglobulin superfamily. Two subgroups of the CEA family, the CEA cell adhesion molecules and the pregnancy-specific glycoproteins, are located within a 1.2 Mb cluster on the long arm of chromosome 19. Eleven pseudogenes of the CEA cell adhesion molecule subgroup are also found in the cluster. The encoded protein was originally described in bile ducts of liver as biliary glycoprotein. Subsequently, it was found to be a cell-cell adhesion molecule detected on leukocytes, epithelia, and endothelia. The encoded protein mediates cell adhesion via homophilic as well as heterophilic binding to other proteins of the subgroup. Multiple cellular activities have been attributed to the encoded protein, including roles in the differentiation and arrangement of tissue three-dimensional structure, angiogenesis, apoptosis, tumor suppression, metastasis, and the modulation of innate and adaptive immune responses. Multiple transcript variants encoding different isoforms have been reported, but the full-length nature of all variants has not been defined. | NA |
| TMCC3 | 57458 | ENSG00000057704 | transmembrane and coiled-coil domain family 3 | NA | NA |
| FCGR3B | 2215 | ENSG00000162747 | Fc fragment of IgG receptor IIIb | The protein encoded by this gene is a low affinity receptor for the Fc region of gamma immunoglobulins (IgG). The encoded protein acts as a monomer and can bind either monomeric or aggregated IgG. This gene may function to capture immune complexes in the peripheral circulation. Several transcript variants encoding different isoforms have been found for this gene. A highly-similar gene encoding a related protein is also found on chromosome 1. | NA |
| SH2D3C | 10044 | ENSG00000095370 | SH2 domain containing 3C | This gene encodes an adaptor protein and member of a cytoplasmic protein family involved in cell migration. The encoded protein contains a putative Src homology 2 (SH2) domain and guanine nucleotide exchange factor-like domain which allows this signaling protein to form a complex with scaffolding protein Crk-associated substrate. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| N4BP3 | 23138 | ENSG00000145911 | NEDD4 binding protein 3 | NA | NA |
| CD34 | 947 | ENSG00000174059 | CD34 molecule | The protein encoded by this gene may play a role in the attachment of stem cells to the bone marrow extracellular matrix or to stromal cells. This single-pass membrane protein is highly glycosylated and phosphorylated by protein kinase C. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| RNF125 | 54941 | ENSG00000101695 | ring finger protein 125 | This gene encodes a novel E3 ubiquitin ligase that contains a RING finger domain in the N-terminus and three zinc-binding and one ubiquitin-interacting motif in the C-terminus. As a result of myristoylation, this protein associates with membranes and is primarily localized to intracellular membrane systems. The encoded protein may function as a positive regulator in the T-cell receptor signaling pathway. | NA |
| TGM3 | 7053 | ENSG00000125780 | transglutaminase 3 | Transglutaminases are enzymes that catalyze the crosslinking of proteins by epsilon-gamma glutamyl lysine isopeptide bonds. While the primary structure of transglutaminases is not conserved, they all have the same amino acid sequence at their active sites and their activity is calcium-dependent. The protein encoded by this gene consists of two polypeptide chains activated from a single precursor protein by proteolysis. The encoded protein is involved in the later stages of cell envelope formation in the epidermis and hair follicle. | NA |
| NEURL1B | 54492 | ENSG00000214357 | neuralized E3 ubiquitin protein ligase 1B | NA | NA |
| RP11-688G15.3 | ENSG00000258749 | ENSG00000258749 | NA | NA | NA |
| HIGD1B | 51751 | ENSG00000131097 | HIG1 hypoxia inducible domain family member 1B | This gene encodes a member of the hypoxia inducible gene 1 (HIG1) domain family. The encoded protein is localized to the cell membrane and has been linked to tumorigenesis and the progression of pituitary adenomas. Alternative splicing results in multiple transcript variants. | NA |
| FGF11 | 2256 | ENSG00000161958 | fibroblast growth factor 11 | The protein encoded by this gene is a member of the fibroblast growth factor (FGF) family. FGF family members possess broad mitogenic and cell survival activities, and are involved in a variety of biological processes, including embryonic development, cell growth, morphogenesis, tissue repair, tumor growth and invasion. The function of this gene has not yet been determined. The expression pattern of the mouse homolog implies a role in nervous system development. Alternative splicing results in multiple transcript variants. | NA |
| RP11-673E1.3 | ENSG00000249741 | ENSG00000249741 | NA | NA | NA |
| DOCK8 | 81704 | ENSG00000107099 | dedicator of cytokinesis 8 | This gene encodes a member of the DOCK180 family of guanine nucleotide exchange factors. Guanine nucleotide exchange factors interact with Rho GTPases and are components of intracellular signaling networks. Mutations in this gene result in the autosomal recessive form of the hyper-IgE syndrome. Alternatively spliced transcript variants encoding different isoforms have been described. | NA |
| PLBD1 | 79887 | ENSG00000121316 | phospholipase B domain containing 1 | NA | NA |
| S100A9 | 6280 | ENSG00000163220 | S100 calcium binding protein A9 | The protein encoded by this gene is a member of the S100 family of proteins containing 2 EF-hand calcium-binding motifs. S100 proteins are localized in the cytoplasm and/or nucleus of a wide range of cells, and involved in the regulation of a number of cellular processes such as cell cycle progression and differentiation. S100 genes include at least 13 members which are located as a cluster on chromosome 1q21. This protein may function in the inhibition of casein kinase and altered expression of this protein is associated with the disease cystic fibrosis. This antimicrobial protein exhibits antifungal and antibacterial activity. | NA |
| SMIM5 | 643008 | ENSG00000204323 | small integral membrane protein 5 | NA | NA |
| CTHRC1 | 115908 | ENSG00000164932 | collagen triple helix repeat containing 1 | This locus encodes a protein that may play a role in the cellular response to arterial injury through involvement in vascular remodeling. Mutations at this locus have been associated with Barrett esophagus and esophageal adenocarcinoma. Alternatively spliced transcript variants have been described. | NA |
| C1QB | 713 | ENSG00000173369 | complement component 1, q subcomponent, B chain | This gene encodes a major constituent of the human complement subcomponent C1q. C1q associates with C1r and C1s in order to yield the first component of the serum complement system. Deficiency of C1q has been associated with lupus erythematosus and glomerulonephritis. C1q is composed of 18 polypeptide chains: six A-chains, six B-chains, and six C-chains. Each chain contains a collagen-like region located near the N terminus and a C-terminal globular region. The A-, B-, and C-chains are arranged in the order A-C-B on chromosome 1. This gene encodes the B-chain polypeptide of human complement subcomponent C1q | NA |
| PCP4L1 | 654790 | ENSG00000248485 | Purkinje cell protein 4 like 1 | NA | NA |
| CSTA | 1475 | ENSG00000121552 | cystatin A | The cystatin superfamily encompasses proteins that contain multiple cystatin-like sequences. Some of the members are active cysteine protease inhibitors, while others have lost or perhaps never acquired this inhibitory activity. There are three inhibitory families in the superfamily, including the type 1 cystatins (stefins), type 2 cystatins, and kininogens. This gene encodes a stefin that functions as a cysteine protease inhibitor, forming tight complexes with papain and the cathepsins B, H, and L. The protein is one of the precursor proteins of cornified cell envelope in keratinocytes and plays a role in epidermal development and maintenance. Stefins have been proposed as prognostic and diagnostic tools for cancer. | NA |
| POPDC2 | 64091 | ENSG00000121577 | popeye domain containing 2 | This gene encodes a member of the POP family of proteins which contain three putative transmembrane domains. This membrane associated protein is predominantly expressed in skeletal and cardiac muscle, and may have an important function in these tissues. | NA |
| CTD-2201I18.1 | 101929215 | ENSG00000249825 | uncharacterized LOC101929215 | NA | NA |
| GAS6-AS1 | ENSG00000233695 | ENSG00000233695 | GAS6 antisense RNA 1 | NA | NA |
| KLHDC8B | 200942 | ENSG00000185909 | kelch domain containing 8B | This gene encodes a protein which forms a distinct beta-propeller protein structure of kelch domains allowing for protein-protein interactions. Mutations in this gene have been associated with Hodgkin lymphoma. | NA |
| NA | NA | ENSG00000180672 | NA | NA | TRUE |
| KRT5 | 3852 | ENSG00000186081 | keratin 5 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the basal layer of the epidermis with family member KRT14. Mutations in these genes have been associated with a complex of diseases termed epidermolysis bullosa simplex. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| RP11-728F11.4 | ENSG00000254528 | ENSG00000254528 | NA | NA | NA |
| SLC1A1 | 6505 | ENSG00000106688 | solute carrier family 1 member 1 | This gene encodes a member of the high-affinity glutamate transporters that play an essential role in transporting glutamate across plasma membranes. In brain, these transporters are crucial in terminating the postsynaptic action of the neurotransmitter glutamate, and in maintaining extracellular glutamate concentrations below neurotoxic levels. This transporter also transports aspartate, and mutations in this gene are thought to cause dicarboxylic aminoaciduria, also known as glutamate-aspartate transport defect. | NA |
| NCF2 | 4688 | ENSG00000116701 | neutrophil cytosolic factor 2 | This gene encodes neutrophil cytosolic factor 2, the 67-kilodalton cytosolic subunit of the multi-protein NADPH oxidase complex found in neutrophils. This oxidase produces a burst of superoxide which is delivered to the lumen of the neutrophil phagosome. Mutations in this gene, as well as in other NADPH oxidase subunits, can result in chronic granulomatous disease, a disease that causes recurrent infections by catalase-positive organisms. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| TCP11L2 | 255394 | ENSG00000166046 | t-complex 11 like 2 | NA | NA |
| MYBL1 | 4603 | ENSG00000185697 | MYB proto-oncogene like 1 | NA | NA |
| THY1 | 7070 | ENSG00000154096 | Thy-1 cell surface antigen | This gene encodes a cell surface glycoprotein and member of the immunoglobulin superfamily of proteins. The encoded protein is involved in cell adhesion and cell communication in numerous cell types, but particularly in cells of the immune and nervous systems. The encoded protein is widely used as a marker for hematopoietic stem cells. This gene may function as a tumor suppressor in nasopharyngeal carcinoma. Alternative splicing results in multiple transcript variants. | NA |
| CPT1B | 1375 | ENSG00000205560 | carnitine palmitoyltransferase 1B | The protein encoded by this gene, a member of the carnitine/choline acetyltransferase family, is the rate-controlling enzyme of the long-chain fatty acid beta-oxidation pathway in muscle mitochondria. This enzyme is required for the net transport of long-chain fatty acyl-CoAs from the cytoplasm into the mitochondria. Multiple transcript variants encoding different isoforms have been found for this gene, and read-through transcripts are expressed from the upstream locus that include exons from this gene. | NA |
| CHL1 | 10752 | ENSG00000134121 | cell adhesion molecule L1 like | The protein encoded by this gene is a member of the L1 gene family of neural cell adhesion molecules. It is a neural recognition molecule that may be involved in signal transduction pathways. The deletion of one copy of this gene may be responsible for mental defects in patients with 3p- syndrome. This protein may also play a role in the growth of certain cancers. Alternate splicing results in both coding and non-coding variants. | NA |
| RP11-334E6.12 | ENSG00000263873 | ENSG00000263873 | NA | NA | NA |
| RP11-350G8.9 | ENSG00000273110 | ENSG00000273110 | NA | NA | NA |
| MYO15B | ENSG00000266714 | ENSG00000266714 | myosin XVB | NA | NA |
| LMO7 | 4008 | ENSG00000136153 | LIM domain 7 | This gene encodes a protein containing a calponin homology (CH) domain, a PDZ domain, and a LIM domain, and may be involved in protein-protein interactions. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, however, the full-length nature of some variants is not known. | NA |
| ABI3 | 51225 | ENSG00000108798 | ABI family member 3 | This gene encodes a member of an adaptor protein family. Members of this family encode proteins containing a homeobox homology domain, proline rich region and Src-homology 3 (SH3) domain, and are components of the Abi/WAVE complex which regulates actin polymerization. The encoded protein inhibits ectopic metastasis of tumor cells as well as cell migration. This may be accomplished through interaction with p21-activated kinase. Alternative splicing results in multiple transcript variants. | NA |
| NCF4 | 4689 | ENSG00000100365 | neutrophil cytosolic factor 4 | The protein encoded by this gene is a cytosolic regulatory component of the superoxide-producing phagocyte NADPH-oxidase, a multicomponent enzyme system important for host defense. This protein is preferentially expressed in cells of myeloid lineage. It interacts primarily with neutrophil cytosolic factor 2 (NCF2/p67-phox) to form a complex with neutrophil cytosolic factor 1 (NCF1/p47-phox), which further interacts with the small G protein RAC1 and translocates to the membrane upon cell stimulation. This complex then activates flavocytochrome b, the membrane-integrated catalytic core of the enzyme system. The PX domain of this protein can bind phospholipid products of the PI(3) kinase, which suggests its role in PI(3) kinase-mediated signaling events. The phosphorylation of this protein was found to negatively regulate the enzyme activity. Alternatively spliced transcript variants encoding distinct isoforms have been observed. | NA |
| MYO1D | 4642 | ENSG00000176658 | myosin ID | NA | NA |
| FN1 | 2335 | ENSG00000115414 | fibronectin 1 | This gene encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in extracellular matrix. The encoded preproprotein is proteolytically processed to generate the mature protein. Fibronectin is involved in cell adhesion and migration processes including embryogenesis, wound healing, blood coagulation, host defense, and metastasis. The gene has three regions subject to alternative splicing, with the potential to produce 20 different transcript variants, at least one of which encodes an isoform that undergoes proteolytic processing. The full-length nature of some variants has not been determined. | NA |
| STAB1 | 23166 | ENSG00000010327 | stabilin 1 | This gene encodes a large, transmembrane receptor protein which may function in angiogenesis, lymphocyte homing, cell adhesion, or receptor scavenging. The protein contains 7 fasciclin, 16 epidermal growth factor (EGF)-like, and 2 laminin-type EGF-like domains as well as a C-type lectin-like hyaluronan-binding Link module. The protein is primarily expressed on sinusoidal endothelial cells of liver, spleen, and lymph node. The receptor has been shown to endocytose ligands such as low density lipoprotein, Gram-positive and Gram-negative bacteria, and advanced glycosylation end products. Supporting its possible role as a scavenger receptor, the protein rapidly cycles between the plasma membrane and early endosomes. | NA |
| OLR1 | 4973 | ENSG00000173391 | oxidized low density lipoprotein receptor 1 | This gene encodes a low density lipoprotein receptor that belongs to the C-type lectin superfamily. This gene is regulated through the cyclic AMP signaling pathway. The encoded protein binds, internalizes and degrades oxidized low-density lipoprotein. This protein may be involved in the regulation of Fas-induced apoptosis. This protein may play a role as a scavenger receptor. Mutations of this gene have been associated with atherosclerosis, risk of myocardial infarction, and may modify the risk of Alzheimer’s disease. Alternate splicing results in multiple transcript variants. | NA |
| RP11-169D4.2 | ENSG00000256633 | ENSG00000256633 | NA | NA | NA |
| TMEM88 | 92162 | ENSG00000167874 | transmembrane protein 88 | NA | NA |
| IGFBP2 | 3485 | ENSG00000115457 | insulin like growth factor binding protein 2 | The protein encoded by this gene is one of six similar proteins that bind insulin-like growth factors I and II (IGF-I and IGF-II). The encoded protein can be secreted into the bloodstream, where it binds IGF-I and IGF-II with high affinity, or it can remain intracellular, interacting with many different ligands. High expression levels of this protein promote the growth of several types of tumors and may be predictive of the chances of recovery of the patient. Several transcript variants, one encoding a secreted isoform and the others encoding nonsecreted isoforms, have been found for this gene. | NA |
| EGLN3 | 112399 | ENSG00000129521 | egl-9 family hypoxia inducible factor 3 | NA | NA |
| SYNM | 23336 | ENSG00000182253 | synemin | The protein encoded by this gene is an intermediate filament (IF) family member. IF proteins are cytoskeletal proteins that confer resistance to mechanical stress and are encoded by a dispersed multigene family. This protein has been found to form a linkage between desmin, which is a subunit of the IF network, and the extracellular matrix, and provides an important structural support in muscle. Two alternatively spliced variants encoding different isoforms have been described for this gene. | NA |
| SLC39A14 | 23516 | ENSG00000104635 | solute carrier family 39 member 14 | Zinc is an essential cofactor for hundreds of enzymes. It is involved in protein, nucleic acid, carbohydrate, and lipid metabolism, as well as in the control of gene transcription, growth, development, and differentiation. SLC39A14 belongs to a subfamily of proteins that show structural characteristics of zinc transporters (Taylor and Nicholson, 2003 [PubMed 12659941]). | NA |
| ATP1B2 | 482 | ENSG00000129244 | ATPase Na+/K+ transporting subunit beta 2 | The protein encoded by this gene belongs to the family of Na+/K+ and H+/K+ ATPases beta chain proteins, and to the subfamily of Na+/K+ -ATPases. Na+/K+ -ATPase is an integral membrane protein responsible for establishing and maintaining the electrochemical gradients of Na and K ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for electrical excitability of nerve and muscle. This enzyme is composed of two subunits, a large catalytic subunit (alpha) and a smaller glycoprotein subunit (beta). The beta subunit regulates, through assembly of alpha/beta heterodimers, the number of sodium pumps transported to the plasma membrane. The glycoprotein subunit of Na+/K+ -ATPase is encoded by multiple genes. This gene encodes a beta 2 subunit. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| CTB-79E8.2 | ENSG00000253445 | ENSG00000253445 | NA | NA | NA |
# Save the Ensembl IDs queried for cluster 14, one per line, without header or quotes.
write.table(as.character(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 14, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
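
The annotation tables in this report do not all list their columns in the same order (for example, one table starts with `symbol` and the next with `summary`), which makes them harder to compare across clusters. A minimal sketch of one way to fix the order before calling `kable` is given below. The helper `order_annotation_columns` and the toy data frame are hypothetical and not part of the original analysis; the column names are taken from the tables shown here.

```r
# Hypothetical helper (not in the original analysis): return the annotation
# columns in a fixed order so tables for different clusters line up.
order_annotation_columns <- function(df,
                                     cols = c("query", "symbol", "name", "summary")) {
  df <- as.data.frame(df)
  # A requested column that is absent from this result is added as all-NA.
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- rep(NA, nrow(df))
  }
  df[, cols, drop = FALSE]
}

# Toy input that mimics the column order of one query result.
toy <- data.frame(summary = c("text", NA),
                  X_id    = c("3576", "5346"),
                  symbol  = c("CXCL8", "PLIN1"),
                  name    = c("C-X-C motif chemokine ligand 8", "perilipin 1"),
                  query   = c("ENSG00000169429", "ENSG00000166819"),
                  stringsAsFactors = FALSE)
ordered <- order_annotation_columns(toy)
```

Under these assumptions, the rendering step would become `kable(order_annotation_columns(out))`, leaving the query and the `write.table` call unchanged.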
out <- mygene::queryMany(gene_list[15,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | symbol | name | query | notfound |
|---|---|---|---|---|---|
| The protein encoded by this gene is a member of the CXC chemokine family. This chemokine is one of the major mediators of the inflammatory response. This chemokine is secreted by several cell types. It functions as a chemoattractant, and is also a potent angiogenic factor. This gene is believed to play a role in the pathogenesis of bronchiolitis, a common respiratory tract disease caused by viral infection. This gene and ten other members of the CXC chemokine gene family form a chemokine gene cluster in a region mapped to chromosome 4q. | 3576 | CXCL8 | C-X-C motif chemokine ligand 8 | ENSG00000169429 | NA |
| The protein encoded by this gene coats lipid storage droplets in adipocytes, thereby protecting them until they can be broken down by hormone-sensitive lipase. The encoded protein is the major cAMP-dependent protein kinase substrate in adipocytes and, when unphosphorylated, may play a role in the inhibition of lipolysis. Alternatively spliced transcript variants varying in the 5’ UTR, but encoding the same protein, have been found for this gene. | 5346 | PLIN1 | perilipin 1 | ENSG00000166819 | NA |
| Members of the F-box protein family, such as FBXO27, are characterized by an approximately 40-amino acid F-box motif. SCF complexes, formed by SKP1 (MIM 601434), cullin (see CUL1; MIM 603134), and F-box proteins, act as protein-ubiquitin ligases. F-box proteins interact with SKP1 through the F box, and they interact with ubiquitination targets through other protein interaction domains (Jin et al., 2004 [PubMed 15520277]). | 126433 | FBXO27 | F-box protein 27 | ENSG00000161243 | NA |
| NA | 81691 | LOC81691 | exonuclease NEF-sp | ENSG00000005189 | NA |
| This gene encodes a low density lipoprotein receptor that belongs to the C-type lectin superfamily. This gene is regulated through the cyclic AMP signaling pathway. The encoded protein binds, internalizes and degrades oxidized low-density lipoprotein. This protein may be involved in the regulation of Fas-induced apoptosis. This protein may play a role as a scavenger receptor. Mutations of this gene have been associated with atherosclerosis, risk of myocardial infarction, and may modify the risk of Alzheimer’s disease. Alternate splicing results in multiple transcript variants. | 4973 | OLR1 | oxidized low density lipoprotein receptor 1 | ENSG00000173391 | NA |
| This gene encodes a member of the serine proteinase inhibitor (serpin) superfamily. This member is the principal inhibitor of tissue plasminogen activator (tPA) and urokinase (uPA), and hence is an inhibitor of fibrinolysis. Defects in this gene are the cause of plasminogen activator inhibitor-1 deficiency (PAI-1 deficiency), and high concentrations of the gene product are associated with thrombophilia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 5054 | SERPINE1 | serpin family E member 1 | ENSG00000106366 | NA |
| Spectrins are principal components of a cell’s membrane-cytoskeleton and are composed of two alpha and two beta spectrin subunits. The protein encoded by this gene (SPTBN2), is called spectrin beta non-erythrocytic 2 or beta-III spectrin. It is related to, but distinct from, the beta-II spectrin gene which is also known as spectrin beta non-erythrocytic 1 (SPTBN1). SPTBN2 regulates the glutamate signaling pathway by stabilizing the glutamate transporter EAAT4 at the surface of the plasma membrane. Mutations in this gene cause a form of spinocerebellar ataxia, SCA5, that is characterized by neurodegeneration, progressive locomotor incoordination, dysarthria, and uncoordinated eye movements. | 6712 | SPTBN2 | spectrin beta, non-erythrocytic 2 | ENSG00000173898 | NA |
| The delta (HBD) and beta (HBB) genes are normally expressed in the adult: two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin. Two alpha chains plus two delta chains constitute HbA-2, which with HbF comprises the remaining 3% of adult hemoglobin. Five beta-like globin genes are found within a 45 kb cluster on chromosome 11 in the following order: 5’-epsilon–Ggamma–Agamma–delta–beta-3’. Mutations in the delta-globin gene are associated with beta-thalassemia. | 3045 | HBD | hemoglobin subunit delta | ENSG00000223609 | NA |
| NA | 78995 | C17orf53 | chromosome 17 open reading frame 53 | ENSG00000125319 | NA |
| NA | ENSG00000262001 | DLGAP1-AS2 | DLGAP1 antisense RNA 2 | ENSG00000262001 | NA |
| NA | ENSG00000267992 | CTB-189B5.3 | NA | ENSG00000267992 | NA |
| This gene is a member of the N-myc downregulated gene family which belongs to the alpha/beta hydrolase superfamily. The protein encoded by this gene is a cytoplasmic protein that is required for cell cycle progression and survival in primary astrocytes and may be involved in the regulation of mitogenic signalling in vascular smooth muscles cells. Alternative splicing results in multiple transcripts encoding different isoforms. | 65009 | NDRG4 | NDRG family member 4 | ENSG00000103034 | NA |
| NA | 9796 | PHYHIP | phytanoyl-CoA 2-hydroxylase interacting protein | ENSG00000168490 | NA |
| Members of the perilipin family, such as PLIN4, coat intracellular lipid storage droplets (Wolins et al., 2003 [PubMed 12840023]). | 729359 | PLIN4 | perilipin 4 | ENSG00000167676 | NA |
| NA | 80162 | ATHL1 | ATH1, acid trehalase-like 1 (yeast) | ENSG00000142102 | NA |
| This gene encodes one of several deubiquitylating enzymes. Ubiquitin modification of proteins is needed for their stability and function; to reverse the process, deubiquitylating enzymes remove ubiquitin. This protein contains an OTU domain and binds Ubal (ubiquitin aldehyde); an active cysteine protease site is present in the OTU domain. | 78990 | OTUB2 | OTU deubiquitinase, ubiquitin aldehyde binding 2 | ENSG00000089723 | NA |
| NA | ENSG00000271857 | RP1-244F24.1 | NA | ENSG00000271857 | NA |
| NA | ENSG00000255507 | RP11-535A19.2 | NA | ENSG00000255507 | NA |
| The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | 5265 | SERPINA1 | serpin family A member 1 | ENSG00000197249 | NA |
| The protein encoded by this gene is involved in both the synthesis and degradation of fructose-2,6-bisphosphate, a regulatory molecule that controls glycolysis in eukaryotes. The encoded protein has a 6-phosphofructo-2-kinase activity that catalyzes the synthesis of fructose-2,6-bisphosphate, and a fructose-2,6-biphosphatase activity that catalyzes the degradation of fructose-2,6-bisphosphate. This protein regulates fructose-2,6-bisphosphate levels in the heart, while a related enzyme encoded by a different gene regulates fructose-2,6-bisphosphate levels in the liver and muscle. This enzyme functions as a homodimer. Two transcript variants encoding two different isoforms have been found for this gene. | 5208 | PFKFB2 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 2 | ENSG00000123836 | NA |
| The protein encoded by this gene plays a key role in the acute regulation of steroid hormone synthesis by enhancing the conversion of cholesterol into pregnenolone. This protein permits the cleavage of cholesterol into pregnenolone by mediating the transport of cholesterol from the outer mitochondrial membrane to the inner mitochondrial membrane. Mutations in this gene are a cause of congenital lipoid adrenal hyperplasia (CLAH), also called lipoid CAH. A pseudogene of this gene is located on chromosome 13. | 6770 | STAR | steroidogenic acute regulatory protein | ENSG00000147465 | NA |
| Lactic acid and pyruvate transport across plasma membranes is catalyzed by members of the proton-linked monocarboxylate transporter (MCT) family, which has been designated solute carrier family-16. Each MCT appears to have slightly different substrate and inhibitor specificities and transport kinetics, which are related to the metabolic requirements of the tissues in which it is found. The MCTs, which include MCT1 (SLC16A1; MIM 600682) and MCT2 (SLC16A7; MIM 603654), are characterized by 12 predicted transmembrane domains (Price et al., 1998 [PubMed 9425115]). | 9123 | SLC16A3 | solute carrier family 16 member 3 | ENSG00000141526 | NA |
| NA | 84518 | CNFN | cornifelin | ENSG00000105427 | NA |
| Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 4620 | MYH2 | myosin, heavy chain 2, skeletal muscle, adult | ENSG00000125414 | NA |
| NA | 84886 | C1orf198 | chromosome 1 open reading frame 198 | ENSG00000119280 | NA |
| The yeast heterotetrameric GINS complex is made up of Sld5 (GINS4; MIM 610611), Psf1, Psf2 (GINS2; MIM 610609), and Psf3 (GINS3; MIM 610610). The formation of the GINS complex is essential for the initiation of DNA replication in yeast and Xenopus egg extracts (Ueno et al., 2005 [PubMed 16287864]). | 9837 | GINS1 | GINS complex subunit 1 | ENSG00000101003 | NA |
| NA | ENSG00000232093 | RP11-307C12.11 | NA | ENSG00000232093 | NA |
| NA | ENSG00000219435 | TEX40 | testis expressed 40 | ENSG00000219435 | NA |
| NA | 143903 | LAYN | layilin | ENSG00000204381 | NA |
| This gene encodes a cytoskeletal LIM protein that binds to actin filaments via a domain that is homologous to erythrocyte dematin. LIM domains, found in over 60 proteins, play key roles in the regulation of developmental pathways. LIM domains also function as protein-binding interfaces, mediating specific protein-protein interactions. The protein encoded by this gene could mediate such interactions between actin filaments and cytoplasmic targets. Alternatively spliced transcript variants encoding different isoforms have been identified. | 3983 | ABLIM1 | actin binding LIM protein 1 | ENSG00000099204 | NA |
| This gene encodes a protease that removes the N-terminal peroxisomal targeting signal (PTS2) from proteins produced in the cytosol, thereby facilitating their import into the peroxisome. The encoded protein is also capable of removing the C-terminal peroxisomal targeting signal (PTS1) from proteins in the peroxisomal matrix. The full-length protein undergoes self-cleavage to produce shorter, potentially inactive, peptides. Alternative splicing results in multiple transcript variants for this gene. | 219743 | TYSND1 | trypsin domain containing 1 | ENSG00000156521 | NA |
| This gene encodes a protein subunit of the GINS heterotetrameric complex, which is essential for the initiation of DNA replication and replisome progression in eukaryotes. Alternatively spliced transcript variants encoding distinct isoforms have been described. | 64785 | GINS3 | GINS complex subunit 3 | ENSG00000181938 | NA |
| This gene encodes a protein containing a calponin homology (CH) domain, a PDZ domain, and a LIM domain, and may be involved in protein-protein interactions. Several alternatively spliced transcript variants encoding different isoforms have been found for this gene, however, the full-length nature of some variants is not known. | 4008 | LMO7 | LIM domain 7 | ENSG00000136153 | NA |
| NA | ENSG00000222112 | RN7SKP16 | RNA, 7SK small nuclear pseudogene 16 | ENSG00000222112 | NA |
| The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3039 | HBA1 | hemoglobin subunit alpha 1 | ENSG00000206172 | NA |
| Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in fast skeletal muscle. Two transcript variants have been identified for this gene. | 4632 | MYL1 | myosin light chain 1 | ENSG00000168530 | NA |
| NA | 84793 | FOXD2-AS1 | FOXD2 antisense RNA 1 (head to head) | ENSG00000237424 | NA |
| NA | ENSG00000267379 | CTC-548K16.5 | NA | ENSG00000267379 | NA |
| The protein encoded by this gene is a component of the SMAD pathway, which regulates cell growth and differentiation through transforming growth factor-beta (TGFB). In the absence of ligand, the encoded protein binds to the promoter region of TGFB-responsive genes and recruits a nuclear repressor complex. TGFB signaling causes SMAD3 to enter the nucleus and degrade this protein, allowing these genes to be activated. Four transcript variants encoding three different isoforms have been found for this gene. | 6498 | SKIL | SKI-like proto-oncogene | ENSG00000136603 | NA |
| NA | ENSG00000253392 | AC006277.2 | NA | ENSG00000253392 | NA |
| The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | 3991 | LIPE | lipase E, hormone sensitive type | ENSG00000079435 | NA |
| The protein encoded by this gene belongs to the ‘regulator of G protein signaling’ family. It inhibits signal transduction by increasing the GTPase activity of G protein alpha subunits. It also may play a role in regulating the kinetics of signaling in the phototransduction cascade. | 6004 | RGS16 | regulator of G-protein signaling 16 | ENSG00000143333 | NA |
| This gene encodes the cytosolic form of serine hydroxymethyltransferase, a pyridoxal phosphate-containing enzyme that catalyzes the reversible conversion of serine and tetrahydrofolate to glycine and 5,10-methylene tetrahydrofolate. This reaction provides one-carbon units for synthesis of methionine, thymidylate, and purines in the cytoplasm. This gene is located within the Smith-Magenis syndrome region on chromosome 17. A pseudogene of this gene is located on the short arm of chromosome 1. Alternative splicing results in multiple transcript variants. | 6470 | SHMT1 | serine hydroxymethyltransferase 1 | ENSG00000176974 | NA |
| This gene encodes a member of the transforming growth factor beta (TGFB) family of cytokines, which are multifunctional peptides that regulate proliferation, differentiation, adhesion, migration, and other functions in many cell types by transducing their signal through combinations of transmembrane type I and type II receptors (TGFBR1 and TGFBR2) and their downstream effectors, the SMAD proteins. Disruption of the TGFB/SMAD pathway has been implicated in a variety of human cancers. The encoded protein is secreted and has suppressive effects of interleukin-2 dependent T-cell growth. Translocation t(1;7)(q41;p21) between this gene and HDAC9 is associated with Peters’ anomaly, a congenital defect of the anterior chamber of the eye. The knockout mice lacking this gene show perinatal mortality and a wide range of developmental, including cardiac, defects. Alternatively spliced transcript variants encoding different isoforms have been identified. | 7042 | TGFB2 | transforming growth factor beta 2 | ENSG00000092969 | NA |
| NA | NA | NA | NA | ENSG00000256545 | TRUE |
| FAAP24 is a component of the Fanconi anemia (FA) core complex (see MIM 227650), which plays a crucial role in DNA damage response (Ciccia et al., 2007 [PubMed 17289582]). | 91442 | FAAP24 | Fanconi anemia core complex associated protein 24 | ENSG00000131944 | NA |
| NA | ENSG00000256462 | RP11-116G8.5 | NA | ENSG00000256462 | NA |
| NA | 65985 | AACS | acetoacetyl-CoA synthetase | ENSG00000081760 | NA |
| This gene encodes a member of the TGF-beta family of proteins. The encoded protein is secreted and is involved in embryogenesis and cell differentiation. Defects in this gene are a cause of familial arrhythmogenic right ventricular dysplasia 1. | 7043 | TGFB3 | transforming growth factor beta 3 | ENSG00000119699 | NA |
| This antimicrobial gene encodes a member of the CXC subfamily of chemokines. The encoded protein is a secreted growth factor that signals through the G-protein coupled receptor, CXC receptor 2. This protein plays a role in inflammation and as a chemoattractant for neutrophils. Aberrant expression of this protein is associated with the growth and progression of certain tumors. A naturally occurring processed form of this protein has increased chemotactic activity. Alternate splicing results in coding and non-coding variants of this gene. A pseudogene of this gene is found on chromosome 4. | 2919 | CXCL1 | C-X-C motif chemokine ligand 1 | ENSG00000163739 | NA |
| This gene is one of several cytokine genes clustered on the q-arm of chromosome 17. Chemokines are a superfamily of secreted proteins involved in immunoregulatory and inflammatory processes. The superfamily is divided into four subfamilies based on the arrangement of N-terminal cysteine residues of the mature peptide. This chemokine is a member of the CC subfamily which is characterized by two adjacent cysteine residues. This cytokine displays chemotactic activity for monocytes and basophils but not for neutrophils or eosinophils. It has been implicated in the pathogenesis of diseases characterized by monocytic infiltrates, like psoriasis, rheumatoid arthritis and atherosclerosis. It binds to chemokine receptors CCR2 and CCR4. | 6347 | CCL2 | C-C motif chemokine ligand 2 | ENSG00000108691 | NA |
| NA | ENSG00000254272 | RP11-382J24.2 | NA | ENSG00000254272 | NA |
| This gene encodes a type I transmembrane protein that is localized to junctional complexes between endothelial and epithelial cells and may have a role in cell-cell adhesion. Expression of this gene in white adipose tissue is implicated in adipocyte maturation and development of obesity. This gene is also essential for normal intestinal development and mutations in the gene are associated with congenital short bowel syndrome. | 79827 | CLMP | CXADR-like membrane protein | ENSG00000166250 | NA |
| NA | 113146 | AHNAK2 | AHNAK nucleoprotein 2 | ENSG00000185567 | NA |
| This gene is a member of the visinin/recoverin subfamily of neuronal calcium sensor proteins. The encoded protein is strongly expressed in granule cells of the cerebellum where it associates with membranes in a calcium-dependent manner and modulates intracellular signaling pathways of the central nervous system by directly or indirectly regulating the activity of adenylyl cyclase. Alternatively spliced transcript variants have been observed, but their full-length nature has not been determined. | 7447 | VSNL1 | visinin like 1 | ENSG00000163032 | NA |
| This gene encodes a member of the desmocollin protein subfamily. Desmocollins, along with desmogleins, are cadherin-like transmembrane glycoproteins that are major components of the desmosome. Desmosomes are cell-cell junctions that help resist shearing forces and are found in high concentrations in cells subject to mechanical stress. This gene is found in a cluster with other desmocollin family members on chromosome 18. Mutations in this gene are associated with arrhythmogenic right ventricular dysplasia-11, and reduced protein expression has been described in several types of cancer. Alternative splicing results in multiple transcript variants. | 1824 | DSC2 | desmocollin 2 | ENSG00000134755 | NA |
| This gene is a member of the immunoglobulin superfamily. The encoded poly-Ig receptor binds polymeric immunoglobulin molecules at the basolateral surface of epithelial cells; the complex is then transported across the cell to be secreted at the apical surface. A significant association was found between immunoglobulin A nephropathy and several SNPs in this gene. | 5284 | PIGR | polymeric immunoglobulin receptor | ENSG00000162896 | NA |
| This gene encodes the receptor for urokinase plasminogen activator and, given its role in localizing and promoting plasmin formation, likely influences many normal and pathological processes related to cell-surface plasminogen activation and localized degradation of the extracellular matrix. It binds both the proprotein and mature forms of urokinase plasminogen activator and permits the activation of the receptor-bound pro-enzyme by plasmin. The protein lacks transmembrane or cytoplasmic domains and may be anchored to the plasma membrane by a glycosyl-phosphatidylinositol (GPI) moiety following cleavage of the nascent polypeptide near its carboxy-terminus. However, a soluble protein is also produced in some cell types. Alternative splicing results in multiple transcript variants encoding different isoforms. The proprotein experiences several post-translational cleavage reactions that have not yet been fully defined. | 5329 | PLAUR | plasminogen activator, urokinase receptor | ENSG00000011422 | NA |
| Defensins are a family of antimicrobial and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. The protein encoded by this gene, defensin, alpha 3, is found in the microbicidal granules of neutrophils and likely plays a role in phagocyte-mediated host defense. Several alpha defensin genes are clustered on chromosome 8. This gene differs from defensin, alpha 1 by only one amino acid. This gene and the gene encoding defensin, alpha 1 are both subject to copy number variation. | 1668 | DEFA3 | defensin alpha 3 | ENSG00000239839 | NA |
| Defensins are a family of antimicrobial and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. The protein encoded by this gene, defensin, alpha 1, is found in the microbicidal granules of neutrophils and likely plays a role in phagocyte-mediated host defense. Several alpha defensin genes are clustered on chromosome 8. This gene differs from defensin, alpha 3 by only one amino acid. This gene and the gene encoding defensin, alpha 3 are both subject to copy number variation. | 1667 | DEFA1 | defensin alpha 1 | ENSG00000239839 | NA |
| Defensins are a family of antimicrobial and cytotoxic peptides thought to be involved in host defense. They are abundant in the granules of neutrophils and also found in the epithelia of mucosal surfaces such as those of the intestine, respiratory tract, urinary tract, and vagina. Members of the defensin family are highly similar in protein sequence and distinguished by a conserved cysteine motif. The protein encoded by this gene, defensin, alpha 1, is found in the microbicidal granules of neutrophils and likely plays a role in phagocyte-mediated host defense. Several alpha defensin genes are clustered on chromosome 8. This gene differs from defensin, alpha 3 by only one amino acid. This gene and the gene encoding defensin, alpha 3 are both subject to copy number variation. Two transcript variants encoding different isoforms have been found for this gene. | 728358 | DEFA1B | defensin alpha 1B | ENSG00000239839 | NA |
| NA | 80336 | PABPC1L | poly(A) binding protein cytoplasmic 1 like | ENSG00000101104 | NA |
| Troponin proteins associate with tropomyosin and regulate the calcium sensitivity of the myofibril contractile apparatus of striated muscles. Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. The TnI-fast and TnI-slow genes are expressed in fast-twitch and slow-twitch skeletal muscle fibers, respectively, while the TnI-cardiac gene is expressed exclusively in cardiac muscle tissue. This gene encodes the Troponin-I-skeletal-slow-twitch protein. This gene is expressed in cardiac and skeletal muscle during early development but is restricted to slow-twitch skeletal muscle fibers in adults. The encoded protein prevents muscle contraction by inhibiting calcium-mediated conformational changes in actin-myosin complexes. | 7135 | TNNI1 | troponin I1, slow skeletal type | ENSG00000159173 | NA |
| NA | 65989 | DLK2 | delta like non-canonical Notch ligand 2 | ENSG00000171462 | NA |
| NA | ENSG00000230530 | LIMD1-AS1 | LIMD1 antisense RNA 1 | ENSG00000230530 | NA |
| This gene encodes a member of the epidermal growth factor (EGF) receptor family of receptor tyrosine kinases. This protein has no ligand binding domain of its own and therefore cannot bind growth factors. However, it does bind tightly to other ligand-bound EGF receptor family members to form a heterodimer, stabilizing ligand binding and enhancing kinase-mediated activation of downstream signalling pathways, such as those involving mitogen-activated protein kinase and phosphatidylinositol-3 kinase. Allelic variations at amino acid positions 654 and 655 of isoform a (positions 624 and 625 of isoform b) have been reported, with the most common allele, Ile654/Ile655, shown here. Amplification and/or overexpression of this gene has been reported in numerous cancers, including breast and ovarian tumors. Alternative splicing results in several additional transcript variants, some encoding different isoforms and others that have not been fully characterized. | 2064 | ERBB2 | erb-b2 receptor tyrosine kinase 2 | ENSG00000141736 | NA |
| This gene encodes a member of the EGF-TM7 subfamily of adhesion G protein-coupled receptors, which mediate cell-cell interactions. These proteins are cleaved by self-catalytic proteolysis into a large extracellular subunit and seven-span transmembrane subunit, which associate at the cell surface as a receptor complex. The encoded protein may play a role in cell adhesion as well as leukocyte recruitment, activation and migration, and contains multiple extracellular EGF-like repeats which mediate binding to chondroitin sulfate and the cell surface complement regulatory protein CD55. Expression of this gene may play a role in the progression of several types of cancer. Alternatively spliced transcript variants encoding multiple isoforms with 3 to 5 EGF-like repeats have been observed for this gene. This gene is found in a cluster with other EGF-TM7 genes on the short arm of chromosome 19. | 976 | ADGRE5 | adhesion G protein-coupled receptor E5 | ENSG00000123146 | NA |
| NA | NA | NA | NA | ENSG00000256005 | TRUE |
| This gene encodes one of the six subunits of type IV collagen, the major structural component of basement membranes. This particular collagen IV subunit, however, is only found in a subset of basement membranes. Like the other members of the type IV collagen gene family, this gene is organized in a head-to-head conformation with another type IV collagen gene so that each gene pair shares a common promoter. Mutations in this gene are associated with type II autosomal recessive Alport syndrome (hereditary glomerulonephropathy) and with familial benign hematuria (thin basement membrane disease). Two transcripts, differing only in their transcription start sites, have been identified for this gene and, as is common for collagen genes, multiple polyadenylation sites are found in the 3’ UTR. | 1286 | COL4A4 | collagen type IV alpha 4 chain | ENSG00000081052 | NA |
| NA | 57467 | HHATL | hedgehog acyltransferase-like | ENSG00000010282 | NA |
| This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | 5967 | REG1A | regenerating family member 1 alpha | ENSG00000115386 | NA |
| NA | 55659 | ZNF416 | zinc finger protein 416 | ENSG00000083817 | NA |
| Neurogranin (NRGN) is the human homolog of the neuron-specific rat RC3/neurogranin gene. This gene encodes a postsynaptic protein kinase substrate that binds calmodulin in the absence of calcium. The NRGN gene contains four exons and three introns. Exons 1 and 2 encode the protein, and exons 3 and 4 contain untranslated sequences. It is suggested that NRGN is a direct target for thyroid hormone in human brain, and that control of expression of this gene could underlie many of the consequences of hypothyroidism on mental states during development as well as in adult subjects. | 4900 | NRGN | neurogranin | ENSG00000154146 | NA |
| This gene encodes a member of the semicarbazide-sensitive amine oxidase family. Copper amine oxidases catalyze the oxidative conversion of amines to aldehydes in the presence of copper and quinone cofactor. The encoded protein is localized to the cell surface, has adhesive properties as well as monoamine oxidase activity, and may be involved in leukocyte trafficking. Alterations in levels of the encoded protein may be associated with many diseases, including diabetes mellitus. A pseudogene of this gene has been described and is located approximately 9-kb downstream on the same chromosome. Alternative splicing results in multiple transcript variants. | 8639 | AOC3 | amine oxidase, copper containing 3 | ENSG00000131471 | NA |
| Acetyl-CoA carboxylase (ACC) is a complex multifunctional enzyme system. ACC is a biotin-containing enzyme which catalyzes the carboxylation of acetyl-CoA to malonyl-CoA, the rate-limiting step in fatty acid synthesis. ACC-beta is thought to control fatty acid oxidation by means of the ability of malonyl-CoA to inhibit carnitine-palmitoyl-CoA transferase I, the rate-limiting step in fatty acid uptake and oxidation by mitochondria. ACC-beta may be involved in the regulation of fatty acid oxidation, rather than fatty acid biosynthesis. There is evidence for the presence of two ACC-beta isoforms. | 32 | ACACB | acetyl-CoA carboxylase beta | ENSG00000076555 | NA |
| This gene encodes a member of the cytokine family. The protein contains a tyrosine sulfation site, 3 potential N-myristoylation sites, multiple putative phosphorylation sites, and an RGD cell-attachment sequence. Expression of this protein is increased after the activation of T-cells by mitogens or the activation of NK cells by IL-2. This protein induces the production of TNFalpha from macrophage cells. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | 9235 | IL32 | interleukin 32 | ENSG00000008517 | NA |
| Guanine nucleotide binding proteins are heterotrimeric signal-transducing molecules consisting of alpha, beta, and gamma subunits. The alpha subunit binds guanine nucleotide, can hydrolyze GTP, and can interact with other proteins. The protein encoded by this gene represents the alpha subunit of an inhibitory complex. The encoded protein is part of a complex that responds to beta-adrenergic signals by inhibiting adenylate cyclase. Two transcript variants encoding different isoforms have been found for this gene. | 2770 | GNAI1 | G protein subunit alpha i1 | ENSG00000127955 | NA |
| NA | ENSG00000177337 | DLGAP1-AS1 | DLGAP1 antisense RNA 1 | ENSG00000177337 | NA |
| Mutations in the Schizosaccharomyces pombe Rae1 and Saccharomyces cerevisiae Gle2 genes have been shown to result in accumulation of poly(A)-containing mRNA in the nucleus, suggesting that the encoded proteins are involved in RNA export. The protein encoded by this gene is a homolog of yeast Rae1. It contains four WD40 motifs, and has been shown to localize to distinct foci in the nucleoplasm, to the nuclear rim, and to meshwork-like structures throughout the cytoplasm. This gene is thought to be involved in nucleocytoplasmic transport, and in directly or indirectly attaching cytoplasmic mRNPs to the cytoskeleton. Alternatively spliced transcript variants encoding the same protein have been found for this gene. | 8480 | RAE1 | ribonucleic acid export 1 | ENSG00000101146 | NA |
| NA | ENSG00000267249 | RP11-973H7.3 | NA | ENSG00000267249 | NA |
| NA | ENSG00000269463 | RP11-727F15.13 | NA | ENSG00000269463 | NA |
| Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. Two molecules of each of the four core histones (H2A, H2B, H3, and H4) form an octamer, around which approximately 146 bp of DNA is wrapped in repeating units, called nucleosomes. The linker histone, H1, interacts with linker DNA between nucleosomes and functions in the compaction of chromatin into higher order structures. This gene is intronless and encodes a replication-dependent histone that is a member of the histone H4 family. Transcripts from this gene lack polyA tails but instead contain a palindromic termination element. This gene is found in the large histone gene cluster on chromosome 6. | 8365 | HIST1H4H | histone cluster 1, H4h | ENSG00000158406 | NA |
| This gene encodes a component of vacuolar ATPase (V-ATPase), a multisubunit enzyme that mediates acidification of eukaryotic intracellular organelles. V-ATPase dependent organelle acidification is necessary for such intracellular processes as protein sorting, zymogen activation, receptor-mediated endocytosis, and synaptic vesicle proton gradient generation. V-ATPase is composed of a cytosolic V1 domain and a transmembrane V0 domain. The V1 domain consists of three A, three B, and two G subunits, as well as a C, D, E, F, and H subunit. The V1 domain contains the ATP catalytic site. This gene encodes alternate transcriptional splice variants, encoding different V1 domain C subunit isoforms. | 245973 | ATP6V1C2 | ATPase H+ transporting V1 subunit C2 | ENSG00000143882 | NA |
| This gene encodes a member of the type 3 G protein-coupling receptor family, characterized by the signature 7-transmembrane domain motif. The encoded protein may be involved in interaction between retinoic acid and G protein signalling pathways. Retinoic acid plays a critical role in development, cellular growth, and differentiation. This gene may play a role in embryonic development and epithelial cell differentiation. | 9052 | GPRC5A | G protein-coupled receptor class C group 5 member A | ENSG00000013588 | NA |
| The protein encoded by this gene belongs to a family of proteins thought to play a role in the exocytosis of synaptic vesicles. Vesicle exocytosis releases vesicular contents and is important to various cellular functions. For instance, the secretion of transmitters from neurons plays an important role in synaptic transmission. After exocytosis, the membrane and proteins from the vesicle are retrieved from the plasma membrane through the process of endocytosis. Mutations in this gene have been identified as one cause of fever-associated epilepsy syndromes. A possible link between this gene and Parkinson’s disease has also been suggested. | 112755 | STX1B | syntaxin 1B | ENSG00000099365 | NA |
| NA | ENSG00000262251 | RP11-199F11.2 | NA | ENSG00000262251 | NA |
| This gene encodes the pulmonary-associated surfactant protein C (SPC), an extremely hydrophobic surfactant protein essential for lung function and homeostasis after birth. Pulmonary surfactant is a surface-active lipoprotein complex composed of 90% lipids and 10% proteins which include plasma proteins and apolipoproteins SPA, SPB, SPC and SPD. The surfactant is secreted by the alveolar cells of the lung and maintains the stability of pulmonary tissue by reducing the surface tension of fluids that coat the lung. Multiple mutations in this gene have been identified, which cause pulmonary surfactant metabolism dysfunction type 2, also called pulmonary alveolar proteinosis due to surfactant protein C deficiency, and are associated with interstitial lung disease in older infants, children, and adults. Alternatively spliced transcript variants encoding different protein isoforms have been identified. | 6440 | SFTPC | surfactant protein C | ENSG00000168484 | NA |
| NA | 79000 | AUNIP | aurora kinase A and ninein interacting protein | ENSG00000127423 | NA |
| NA | 253635 | GPATCH11 | G-patch domain containing 11 | ENSG00000152133 | NA |
| NA | 27106 | ARRDC2 | arrestin domain containing 2 | ENSG00000105643 | NA |
| NA | 124976 | SPNS2 | sphingolipid transporter 2 | ENSG00000183018 | NA |
| The gamma globin genes (HBG1 and HBG2) are normally expressed in the fetal liver, spleen and bone marrow. Two gamma chains together with two alpha chains constitute fetal hemoglobin (HbF) which is normally replaced by adult hemoglobin (HbA) at birth. In some beta-thalassemias and related conditions, gamma chain production continues into adulthood. The two types of gamma chains differ at residue 136 where glycine is found in the G-gamma product (HBG2) and alanine is found in the A-gamma product (HBG1). The former is predominant at birth. The order of the genes in the beta-globin cluster is: 5’- epsilon – gamma-G – gamma-A – delta – beta–3’. | 3048 | HBG2 | hemoglobin subunit gamma 2 | ENSG00000196565 | NA |
| The protein encoded by this gene is an adenosine receptor that belongs to the G-protein coupled receptor 1 family. There are 3 types of adenosine receptors, each with a specific pattern of ligand binding and tissue distribution, and together they regulate a diverse set of physiologic functions. The type A1 receptors inhibit adenylyl cyclase, and play a role in the fertilization process. Animal studies also suggest a role for A1 receptors in kidney function and ethanol intoxication. Transcript variants with alternative splicing in the 5’ UTR have been found for this gene. | 134 | ADORA1 | adenosine A1 receptor | ENSG00000163485 | NA |
| NA | 441478 | NRARP | NOTCH-regulated ankyrin repeat protein | ENSG00000198435 | NA |
| NA | 151246 | SGO2 | shugoshin 2 | ENSG00000163535 | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and is involved in the conversion of progesterone to cortisol in the adrenal cortex. Mutations in this gene cause congenital adrenal hyperplasia due to 11-beta-hydroxylase deficiency. Transcript variants encoding different isoforms have been noted for this gene. | 1584 | CYP11B1 | cytochrome P450 family 11 subfamily B member 1 | ENSG00000160882 | NA |
| Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. This structure consists of approximately 146 bp of DNA wrapped around a nucleosome, an octamer composed of pairs of each of the four core histones (H2A, H2B, H3, and H4). The chromatin fiber is further compacted through the interaction of a linker histone, H1, with the DNA between the nucleosomes to form higher order chromatin structures. This gene encodes a replication-dependent histone that is a member of the histone H2B family and is found in a histone cluster on chromosome 1. | 440689 | HIST2H2BF | histone cluster 2, H2bf | ENSG00000203814 | NA |
| NA | NA | NA | NA | ENSG00000156750 | TRUE |
| This gene encodes a member of the tyrosine phosphatase family of proteins and exhibits dual specificity by dephosphorylating tyrosine as well as serine and threonine residues. This gene has been described as both a tumor suppressor and an oncogene depending on the cellular context. This protein may regulate neuronal proliferation and has been implicated in the progression of glioblastoma through its ability to dephosphorylate the p53 tumor suppressor. Alternative splicing results in multiple transcript variants. | 78986 | DUSP26 | dual specificity phosphatase 26 (putative) | ENSG00000133878 | NA |
| The Rab subfamily of small GTPases plays an important role in the regulation of membrane trafficking. RAB17 is an epithelial cell-specific GTPase (Lutcke et al., 1993 [PubMed 8486736]). | 64284 | RAB17 | RAB17, member RAS oncogene family | ENSG00000124839 | NA |
| This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. The encoded protein may act as a transcriptional activator. The protein can efficiently bind the NGFI-B Response Element (NBRE). Three different versions of extraskeletal myxoid chondrosarcomas (EMCs) are the result of reciprocal translocations between this gene and other genes. The translocation breakpoints are associated with Nuclear Receptor Subfamily 4, Group A, Member 3 (on chromosome 9) and either Ewing Sarcoma Breakpoint Region 1 (on chromosome 22), RNA Polymerase II, TATA Box-Binding Protein-Associated Factor, 68-KD (on chromosome 17), or Transcription factor 12 (on chromosome 15). Multiple transcript variants encoding different isoforms have been found for this gene. | 8013 | NR4A3 | nuclear receptor subfamily 4 group A member 3 | ENSG00000119508 | NA |
| The human alpha globin gene cluster located on chromosome 16 spans about 30 kb and includes seven loci: 5’- zeta - pseudozeta - mu - pseudoalpha-1 - alpha-2 - alpha-1 - theta - 3’. The alpha-2 (HBA2) and alpha-1 (HBA1) coding sequences are identical. These genes differ slightly over the 5’ untranslated regions and the introns, but they differ significantly over the 3’ untranslated regions. Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1; some nondeletion alpha thalassemias have also been reported. | 3040 | HBA2 | hemoglobin subunit alpha 2 | ENSG00000188536 | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 15, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
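The export step above writes every queried ID, including IDs that mygene could not match (rows with `notfound` set to `TRUE`). As a minimal sketch of an optional variant, the helper below drops unmatched and duplicated IDs before writing. The function name `write_found_ids`, its arguments, and the use of `tempdir()` are assumptions made for illustration and are not part of the original analysis; the toy data frame only stands in for the `out` object.

```r
# Hypothetical helper (not in the original analysis): write one cluster's
# Ensembl IDs to a text file, skipping IDs flagged as not found.
write_found_ids <- function(res, cluster, out_dir = tempdir()) {
  res <- as.data.frame(res)
  # The `notfound` column is absent from some result tables in this report,
  # so treat a missing column as "everything was found".
  missing <- if ("notfound" %in% names(res)) {
    res$notfound %in% TRUE
  } else {
    rep(FALSE, nrow(res))
  }
  ids <- unique(as.character(res$query[!missing]))
  path <- file.path(out_dir, paste0("gene_names_clus_", cluster, ".txt"))
  writeLines(ids, path)
  invisible(path)
}

# Toy stand-in for `out`, using three rows copied from the table above.
toy <- data.frame(
  query    = c("ENSG00000203814", "ENSG00000156750", "ENSG00000133878"),
  symbol   = c("HIST2H2BF", NA, "DUSP26"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
path <- write_found_ids(toy, cluster = 15)
kept <- readLines(path)
```

Whether unmatched IDs should be kept depends on what the downstream tool expects; the original call keeps them, which preserves the full top-feature list for the cluster.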
out <- mygene::queryMany(gene_list[16,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | query | name | X_id | summary | notfound |
|---|---|---|---|---|---|
| IGLC3 | ENSG00000211679 | immunoglobulin lambda constant 3 (Kern-Oz+ marker) | ENSG00000211679 | NA | NA |
| IGHM | ENSG00000211899 | immunoglobulin heavy constant mu | ENSG00000211899 | NA | NA |
| IGLC1 | ENSG00000211675 | immunoglobulin lambda constant 1 (Mcg marker) | ENSG00000211675 | NA | NA |
| IGLL5 | ENSG00000254709 | immunoglobulin lambda like polypeptide 5 | 100423062 | This gene encodes one of the immunoglobulin lambda-like polypeptides. It is located within the immunoglobulin lambda locus but it does not require somatic rearrangement for expression. The first exon of this gene is unrelated to immunoglobulin variable genes; the second and third exons are the immunoglobulin lambda joining 1 and the immunoglobulin lambda constant 1 gene segments. Alternative splicing results in multiple transcript variants. | NA |
| IGLC2 | ENSG00000211677 | immunoglobulin lambda constant 2 (Kern-Oz- marker) | ENSG00000211677 | NA | NA |
| IGHA1 | ENSG00000211895 | immunoglobulin heavy constant alpha 1 | ENSG00000211895 | NA | NA |
| IGHA2 | ENSG00000211890 | immunoglobulin heavy constant alpha 2 (A2m marker) | ENSG00000211890 | NA | NA |
| SIGLEC10 | ENSG00000142512 | sialic acid binding Ig like lectin 10 | 89790 | SIGLECs are members of the immunoglobulin superfamily that are expressed on the cell surface. Most SIGLECs have 1 or more cytoplasmic immune receptor tyrosine-based inhibitory motifs, or ITIMs. SIGLECs are typically expressed on cells of the innate immune system, with the exception of the B-cell expressed SIGLEC6 (MIM 604405). | NA |
| CTD-2616J11.3 | ENSG00000254760 | NA | ENSG00000254760 | NA | NA |
| CTD-2616J11.2 | ENSG00000255441 | NA | ENSG00000255441 | NA | NA |
| JCHAIN | ENSG00000132465 | joining chain of multimeric IgA and IgM | 3512 | NA | NA |
| CTSS | ENSG00000163131 | cathepsin S | 1520 | The protein encoded by this gene, a member of the peptidase C1 family, is a lysosomal cysteine proteinase that may participate in the degradation of antigenic proteins to peptides for presentation on MHC class II molecules. The encoded protein can function as an elastase over a broad pH range in alveolar macrophages. Alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | NA |
| SERPINA1 | ENSG00000197249 | serpin family A member 1 | 5265 | The protein encoded by this gene is secreted and is a serine protease inhibitor whose targets include elastase, plasmin, thrombin, trypsin, chymotrypsin, and plasminogen activator. Defects in this gene can cause emphysema or liver disease. Several transcript variants encoding the same protein have been found for this gene. | NA |
| APOBR | ENSG00000184730 | apolipoprotein B receptor | 55911 | Apolipoprotein B48 receptor is a macrophage receptor that binds to the apolipoprotein B48 of dietary triglyceride (TG)-rich lipoproteins. This receptor may provide essential lipids, lipid-soluble vitamins and other nutrients to reticuloendothelial cells. If overwhelmed with elevated plasma triglyceride, the apolipoprotein B48 receptor may contribute to foam cell formation, endothelial dysfunction, and atherothrombogenesis. | NA |
| CYP2S1 | ENSG00000167600 | cytochrome P450 family 2 subfamily S member 1 | 29785 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. In rodents, the homologous protein has been shown to metabolize certain carcinogens; however, the specific function of the human protein has not been determined. | NA |
| CTB-171A8.1 | ENSG00000266903 | NA | ENSG00000266903 | NA | NA |
| RP11-731F5.2 | ENSG00000253364 | NA | ENSG00000253364 | NA | NA |
| IGHG3 | ENSG00000211897 | immunoglobulin heavy constant gamma 3 (G3m marker) | ENSG00000211897 | NA | NA |
| LGALS4 | ENSG00000171747 | galectin 4 | 3960 | The galectins are a family of beta-galactoside-binding proteins implicated in modulating cell-cell and cell-matrix interactions. The expression of this gene is restricted to small intestine, colon, and rectum, and it is underexpressed in colorectal cancer. | NA |
| TBXAS1 | ENSG00000059377 | thromboxane A synthase 1 | 6916 | This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. However, this protein is considered a member of the cytochrome P450 superfamily on the basis of sequence similarity rather than functional similarity. This endoplasmic reticulum membrane protein catalyzes the conversion of prostaglandin H2 to thromboxane A2, a potent vasoconstrictor and inducer of platelet aggregation. The enzyme plays a role in several pathophysiological processes including hemostasis, cardiovascular disease, and stroke. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| IGHG1 | ENSG00000211896 | immunoglobulin heavy constant gamma 1 (G1m marker) | ENSG00000211896 | NA | NA |
| PLAC8 | ENSG00000145287 | placenta specific 8 | 51316 | NA | NA |
| IGHG2 | ENSG00000211893 | immunoglobulin heavy constant gamma 2 (G2m marker) | ENSG00000211893 | NA | NA |
| KRT1 | ENSG00000167768 | keratin 1 | 3848 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the spinous and granular layers of the epidermis with family member KRT10 and mutations in these genes have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. | NA |
| CLDN7 | ENSG00000181885 | claudin 7 | 1366 | This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. Differential expression of this gene has been observed in different types of malignancies, including breast cancer, ovarian cancer, hepatocellular carcinomas, urinary tumors, prostate cancer, lung cancer, head and neck cancers, thyroid carcinomas, etc. Alternatively spliced transcript variants encoding different isoforms have been found. | NA |
| EPCAM | ENSG00000119888 | epithelial cell adhesion molecule | 4072 | This gene encodes a carcinoma-associated antigen and is a member of a family that includes at least two type I membrane proteins. This antigen is expressed on most normal epithelial cells and gastrointestinal carcinomas and functions as a homotypic calcium-independent cell adhesion molecule. The antigen is being used as a target for immunotherapy treatment of human carcinomas. Mutations in this gene result in congenital tufting enteropathy. | NA |
| SULT1A2 | ENSG00000197165 | sulfotransferase family 1A member 2 | 6799 | Sulfotransferase enzymes catalyze the sulfate conjugation of many hormones, neurotransmitters, drugs, and xenobiotic compounds. These cytosolic enzymes are different in their tissue distributions and substrate specificities. The gene structure (number and length of exons) is similar among family members. This gene encodes one of two phenol sulfotransferases with thermostable enzyme activity. Two alternatively spliced variants that encode the same protein have been described. | NA |
| P2RX1 | ENSG00000108405 | purinergic receptor P2X 1 | 5023 | The protein encoded by this gene belongs to the P2X family of G-protein-coupled receptors. These proteins can form homo- and heterotrimers and function as ATP-gated ion channels and mediate rapid and selective permeability to cations. This protein is primarily localized to smooth muscle where it binds ATP and mediates synaptic transmission between neurons and from neurons to smooth muscle and may be responsible for sympathetic vasoconstriction in small arteries, arterioles and vas deferens. Mouse studies suggest that this receptor is essential for normal male reproductive function. This protein may also be involved in promoting apoptosis. | NA |
| CD79A | ENSG00000105369 | CD79a molecule | 973 | The B lymphocyte antigen receptor is a multimeric complex that includes the antigen-specific component, surface immunoglobulin (Ig). Surface Ig non-covalently associates with two other proteins, Ig-alpha and Ig-beta, which are necessary for expression and function of the B-cell antigen receptor. This gene encodes the Ig-alpha protein of the B-cell antigen component. Alternatively spliced transcript variants encoding different isoforms have been described. | NA |
| SPINK1 | ENSG00000164266 | serine peptidase inhibitor, Kazal type 1 | 6690 | The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. | NA |
| SFN | ENSG00000175793 | stratifin | 2810 | NA | NA |
| IL1B | ENSG00000125538 | interleukin 1 beta | 3553 | The protein encoded by this gene is a member of the interleukin 1 cytokine family. This cytokine is produced by activated macrophages as a proprotein, which is proteolytically processed to its active form by caspase 1 (CASP1/ICE). This cytokine is an important mediator of the inflammatory response, and is involved in a variety of cellular activities, including cell proliferation, differentiation, and apoptosis. The induction of cyclooxygenase-2 (PTGS2/COX2) by this cytokine in the central nervous system (CNS) is found to contribute to inflammatory pain hypersensitivity. This gene and eight other interleukin 1 family genes form a cytokine gene cluster on chromosome 2. | NA |
| LYZ | ENSG00000090382 | lysozyme | 4069 | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | NA |
| STXBP2 | ENSG00000076944 | syntaxin binding protein 2 | 6813 | This gene encodes a member of the STXBP/unc-18/SEC1 family. The encoded protein is involved in intracellular trafficking, control of SNARE (soluble NSF attachment protein receptor) complex assembly, and the release of cytotoxic granules by natural killer cells. Mutations in this gene are associated with familial hemophagocytic lymphohistiocytosis. Alternatively spliced transcript variants encoding different isoforms have been noted for this gene. | NA |
| MATK | ENSG00000007264 | megakaryocyte-associated tyrosine kinase | 4145 | The protein encoded by this gene has amino acid sequence similarity to Csk tyrosine kinase and has the structural features of the CSK subfamily: SRC homology SH2 and SH3 domains, a catalytic domain, a unique N terminus, lack of myristylation signals, lack of a negative regulatory phosphorylation site, and lack of an autophosphorylation site. This protein is thought to play a significant role in the signal transduction of hematopoietic cells. It is able to phosphorylate and inactivate Src family kinases, and may play an inhibitory role in the control of T-cell proliferation. This protein might be involved in signaling in some cases of breast cancer. Three alternatively spliced transcript variants that encode different isoforms have been described for this gene. | NA |
| LTK | ENSG00000062524 | leukocyte receptor tyrosine kinase | 4058 | The protein encoded by this gene is a member of the ros/insulin receptor family of tyrosine kinases. Tyrosine-specific phosphorylation of proteins is a key to the control of diverse pathways leading to cell growth and differentiation. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| TSPAN1 | ENSG00000117472 | tetraspanin 1 | 10103 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | NA |
| OASL | ENSG00000135114 | 2’-5’-oligoadenylate synthetase like | 8638 | NA | NA |
| IGHG4 | ENSG00000211892 | immunoglobulin heavy constant gamma 4 (G4m marker) | ENSG00000211892 | NA | NA |
| ITGAX | ENSG00000140678 | integrin subunit alpha X | 3687 | This gene encodes the integrin alpha X chain protein. Integrins are heterodimeric integral membrane proteins composed of an alpha chain and a beta chain. This protein combines with the beta 2 chain (ITGB2) to form a leukocyte-specific integrin referred to as inactivated-C3b (iC3b) receptor 4 (CR4). The alpha X beta 2 complex seems to overlap the properties of the alpha M beta 2 integrin in the adherence of neutrophils and monocytes to stimulated endothelium cells, and in the phagocytosis of complement coated particles. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| ARHGAP45 | ENSG00000180448 | Rho GTPase activating protein 45 | 23526 | NA | NA |
| CMPK2 | ENSG00000134326 | cytidine/uridine monophosphate kinase 2 | 129607 | This gene encodes one of the enzymes in the nucleotide synthesis salvage pathway that may participate in terminal differentiation of monocytic cells. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| ST14 | ENSG00000149418 | suppression of tumorigenicity 14 | 6768 | The protein encoded by this gene is an epithelial-derived, integral membrane serine protease. This protease forms a complex with the Kunitz-type serine protease inhibitor, HAI-1, and is found to be activated by sphingosine 1-phosphate. This protease has been shown to cleave and activate hepatocyte growth factor/scattering factor, and urokinase plasminogen activator, which suggest the function of this protease as an epithelial membrane activator for other proteases and latent growth factors. The expression of this protease has been associated with breast, colon, prostate, and ovarian tumors, which implicates its role in cancer invasion, and metastasis. | NA |
| TPD52 | ENSG00000076554 | tumor protein D52 | 7163 | NA | NA |
| NA | ENSG00000161570 | NA | NA | NA | TRUE |
| RPS6KA1 | ENSG00000117676 | ribosomal protein S6 kinase A1 | 6195 | This gene encodes a member of the RSK (ribosomal S6 kinase) family of serine/threonine kinases. This kinase contains 2 nonidentical kinase catalytic domains and phosphorylates various substrates, including members of the mitogen-activated kinase (MAPK) signalling pathway. The activity of this protein has been implicated in controlling cell growth and differentiation. Alternate transcriptional splice variants, encoding different isoforms, have been characterized. | NA |
| CARD11 | ENSG00000198286 | caspase recruitment domain family member 11 | 84433 | The protein encoded by this gene belongs to the membrane-associated guanylate kinase (MAGUK) family, a class of proteins that functions as molecular scaffolds for the assembly of multiprotein complexes at specialized regions of the plasma membrane. This protein is also a member of the CARD protein family, which is defined by carrying a characteristic caspase-associated recruitment domain (CARD). This protein has a domain structure similar to that of CARD14 protein. The CARD domains of both proteins have been shown to specifically interact with BCL10, a protein known to function as a positive regulator of cell apoptosis and NF-kappaB activation. When expressed in cells, this protein activated NF-kappaB and induced the phosphorylation of BCL10. | NA |
| PDIA2 | ENSG00000185615 | protein disulfide isomerase family A member 2 | 64714 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | NA |
| SPINT1 | ENSG00000166145 | serine peptidase inhibitor, Kunitz type 1 | 6692 | The protein encoded by this gene is a member of the Kunitz family of serine protease inhibitors. The protein is a potent inhibitor specific for HGF activator and is thought to be involved in the regulation of the proteolytic activation of HGF in injured tissues. Alternative splicing results in multiple variants encoding different isoforms. | NA |
| CEACAM1 | ENSG00000079385 | carcinoembryonic antigen related cell adhesion molecule 1 | 634 | This gene encodes a member of the carcinoembryonic antigen (CEA) gene family, which belongs to the immunoglobulin superfamily. Two subgroups of the CEA family, the CEA cell adhesion molecules and the pregnancy-specific glycoproteins, are located within a 1.2 Mb cluster on the long arm of chromosome 19. Eleven pseudogenes of the CEA cell adhesion molecule subgroup are also found in the cluster. The encoded protein was originally described in bile ducts of liver as biliary glycoprotein. Subsequently, it was found to be a cell-cell adhesion molecule detected on leukocytes, epithelia, and endothelia. The encoded protein mediates cell adhesion via homophilic as well as heterophilic binding to other proteins of the subgroup. Multiple cellular activities have been attributed to the encoded protein, including roles in the differentiation and arrangement of tissue three-dimensional structure, angiogenesis, apoptosis, tumor suppression, metastasis, and the modulation of innate and adaptive immune responses. Multiple transcript variants encoding different isoforms have been reported, but the full-length nature of all variants has not been defined. | NA |
| RBM47 | ENSG00000163694 | RNA binding motif protein 47 | 54502 | NA | NA |
| GALE | ENSG00000117308 | UDP-galactose-4-epimerase | 2582 | This gene encodes UDP-galactose-4-epimerase which catalyzes two distinct but analogous reactions: the epimerization of UDP-glucose to UDP-galactose, and the epimerization of UDP-N-acetylglucosamine to UDP-N-acetylgalactosamine. The bifunctional nature of the enzyme has the important metabolic consequence that mutant cells (or individuals) are dependent not only on exogenous galactose, but also on exogenous N-acetylgalactosamine as a necessary precursor for the synthesis of glycoproteins and glycolipids. Mutations in this gene result in epimerase-deficiency galactosemia, also referred to as galactosemia type 3, a disease characterized by liver damage, early-onset cataracts, deafness and mental retardation, with symptoms ranging from mild (‘peripheral’ form) to severe (‘generalized’ form). Multiple alternatively spliced transcripts encoding the same protein have been identified. | NA |
| RP11-703H8.7 | ENSG00000255118 | NA | ENSG00000255118 | NA | NA |
| KRT10 | ENSG00000186395 | keratin 10 | 3858 | This gene encodes a member of the type I (acidic) cytokeratin family, which belongs to the superfamily of intermediate filament (IF) proteins. Keratins are heteropolymeric structural proteins which form the intermediate filament. These filaments, along with actin microfilaments and microtubules, compose the cytoskeleton of epithelial cells. Mutations in this gene are associated with epidermolytic hyperkeratosis. This gene is located within a cluster of keratin family members on chromosome 17q21. | NA |
| TSPAN13 | ENSG00000106537 | tetraspanin 13 | 27075 | The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. | NA |
| TOX | ENSG00000198846 | thymocyte selection associated high mobility group box | 9760 | The protein encoded by this gene contains a HMG box DNA binding domain. HMG boxes are found in many eukaryotic proteins involved in chromatin assembly, transcription and replication. This protein may function to regulate T-cell development. | NA |
| IL18RAP | ENSG00000115607 | interleukin 18 receptor accessory protein | 8807 | The protein encoded by this gene is an accessory subunit of the heterodimeric receptor for interleukin 18 (IL18), a proinflammatory cytokine involved in inducing cell-mediated immunity. This protein enhances the IL18-binding activity of the IL18 receptor and plays a role in signaling by IL18. Mutations in this gene are associated with Crohn’s disease and inflammatory bowel disease, and susceptibility to celiac disease and leprosy. Alternatively spliced transcript variants of this gene have been described, but their full-length nature is not known. | NA |
| CNN1 | ENSG00000130176 | calponin 1 | 1264 | NA | NA |
| NLRP3 | ENSG00000162711 | NLR family pyrin domain containing 3 | 114548 | This gene encodes a pyrin-like protein containing a pyrin domain, a nucleotide-binding site (NBS) domain, and a leucine-rich repeat (LRR) motif. This protein interacts with the apoptosis-associated speck-like protein PYCARD/ASC, which contains a caspase recruitment domain, and is a member of the NALP3 inflammasome complex. This complex functions as an upstream activator of NF-kappaB signaling, and it plays a role in the regulation of inflammation, the immune response, and apoptosis. Mutations in this gene are associated with familial cold autoinflammatory syndrome (FCAS), Muckle-Wells syndrome (MWS), chronic infantile neurological cutaneous and articular (CINCA) syndrome, and neonatal-onset multisystem inflammatory disease (NOMID). Multiple alternatively spliced transcript variants encoding distinct isoforms have been identified for this gene. Alternative 5’ UTR structures are suggested by available data; however, insufficient evidence is available to determine if all of the represented 5’ UTR splice patterns are biologically valid. | NA |
| SLA | ENSG00000155926 | Src-like-adaptor | 6503 | NA | NA |
| LTB | ENSG00000227507 | lymphotoxin beta | 4050 | Lymphotoxin beta is a type II membrane protein of the TNF family. It anchors lymphotoxin-alpha to the cell surface through heterotrimer formation. The predominant form on the lymphocyte surface is the lymphotoxin-alpha 1/beta 2 complex (e.g. 1 molecule alpha/2 molecules beta) and this complex is the primary ligand for the lymphotoxin-beta receptor. The minor complex is lymphotoxin-alpha 2/beta 1. LTB is an inducer of the inflammatory response system and involved in normal development of lymphoid tissue. Lymphotoxin-beta isoform b is unable to complex with lymphotoxin-alpha suggesting a function for lymphotoxin-beta which is independent of lymphotoxin-alpha. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| ALB | ENSG00000163631 | albumin | 213 | Albumin is a soluble, monomeric protein which comprises about one-half of the blood serum protein. Albumin functions primarily as a carrier protein for steroids, fatty acids, and thyroid hormones and plays a role in stabilizing extracellular fluid volume. Albumin is a globular unglycosylated serum protein of molecular weight 65,000. Albumin is synthesized in the liver as preproalbumin which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. | NA |
| TREM2 | ENSG00000095970 | triggering receptor expressed on myeloid cells 2 | 54209 | This gene encodes a membrane protein that forms a receptor signaling complex with the TYRO protein tyrosine kinase binding protein. The encoded protein functions in immune response and may be involved in chronic inflammation by triggering the production of constitutive inflammatory cytokines. Defects in this gene are a cause of polycystic lipomembranous osteodysplasia with sclerosing leukoencephalopathy (PLOSL). Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| SOX9 | ENSG00000125398 | SRY-box 9 | 6662 | The protein encoded by this gene recognizes the sequence CCTTGAG along with other members of the HMG-box class DNA-binding proteins. It acts during chondrocyte differentiation and, with steroidogenic factor 1, regulates transcription of the anti-Muellerian hormone (AMH) gene. Deficiencies lead to the skeletal malformation syndrome campomelic dysplasia, frequently with sex reversal. | NA |
| GPX2 | ENSG00000176153 | glutathione peroxidase 2 | 2877 | This gene is a member of the glutathione peroxidase family and encodes a selenium-dependent glutathione peroxidase that is one of two isoenzymes responsible for the majority of the glutathione-dependent hydrogen peroxide-reducing activity in the epithelium of the gastrointestinal tract. The protein encoded by this locus contains a selenocysteine (Sec) residue encoded by the UGA codon, which normally signals translation termination. Alternatively spliced transcript variants have been described. | NA |
| SFTPA1 | ENSG00000122852 | surfactant protein A1 | 653509 | This gene encodes a lung surfactant protein that is a member of a subfamily of C-type lectins called collectins. The encoded protein binds specific carbohydrate moieties found on lipids and on the surface of microorganisms. This protein plays an essential role in surfactant homeostasis and in the defense against respiratory pathogens. Mutations in this gene are associated with idiopathic pulmonary fibrosis. Alternate splicing results in multiple transcript variants. | NA |
| LSR | ENSG00000105699 | lipolysis stimulated lipoprotein receptor | 51599 | NA | NA |
| PCED1B-AS1 | ENSG00000247774 | PCED1B antisense RNA 1 | 100233209 | NA | NA |
| SLC7A7 | ENSG00000155465 | solute carrier family 7 member 7 | 9056 | The protein encoded by this gene is the light subunit of a cationic amino acid transporter. This sodium-independent transporter is formed when the light subunit encoded by this gene dimerizes with the heavy subunit transporter protein SLC3A2. This transporter is found in epithelial cell membranes where it transfers cationic and large neutral amino acids from the cell to the extracellular space. Defects in this gene are a cause of lysinuric protein intolerance (LPI). Alternative splicing results in multiple transcript variants. | NA |
| SPOCK2 | ENSG00000107742 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 2 | 9806 | This gene encodes a protein which binds with glycosaminoglycans to form part of the extracellular matrix. The protein contains thyroglobulin type-1, follistatin-like, and calcium-binding domains, and has glycosaminoglycan attachment sites in the acidic C-terminal region. Three alternatively spliced transcript variants that encode different protein isoforms have been described for this gene. | NA |
| CD52 | ENSG00000169442 | CD52 molecule | 1043 | NA | NA |
| APOH | ENSG00000091583 | apolipoprotein H | 350 | Apolipoprotein H has been implicated in a variety of physiologic pathways including lipoprotein metabolism, coagulation, and the production of antiphospholipid autoantibodies. APOH may be a required cofactor for anionic phospholipid binding by the antiphospholipid autoantibodies found in sera of many patients with lupus and primary antiphospholipid syndrome, but it does not seem to be required for the reactivity of antiphospholipid autoantibodies associated with infections. | NA |
| RP11-324O2.3 | ENSG00000232934 | NA | ENSG00000232934 | NA | NA |
| ABRACL | ENSG00000146386 | ABRA C-terminal like | 58527 | NA | NA |
| RAP1GAP | ENSG00000076864 | RAP1 GTPase activating protein | 5909 | This gene encodes a type of GTPase-activating-protein (GAP) that down-regulates the activity of the ras-related RAP1 protein. RAP1 acts as a molecular switch by cycling between an inactive GDP-bound form and an active GTP-bound form. The product of this gene, RAP1GAP, promotes the hydrolysis of bound GTP and hence returns RAP1 to the inactive state whereas other proteins, guanine nucleotide exchange factors (GEFs), act as RAP1 activators by facilitating the conversion of RAP1 from the GDP- to the GTP-bound form. In general, ras subfamily proteins, such as RAP1, play key roles in receptor-linked signaling pathways that control cell growth and differentiation. RAP1 plays a role in diverse processes such as cell proliferation, adhesion, differentiation, and embryogenesis. Alternative splicing results in multiple transcript variants encoding distinct proteins. | NA |
| TNFRSF11A | ENSG00000141655 | tumor necrosis factor receptor superfamily member 11a | 8792 | The protein encoded by this gene is a member of the TNF-receptor superfamily. This receptor can interact with various TRAF family proteins, through which this receptor induces the activation of NF-kappa B and MAPK8/JNK. This receptor and its ligand are important regulators of the interaction between T cells and dendritic cells. This receptor is also an essential mediator for osteoclast and lymph node development. Mutations at this locus have been associated with familial expansile osteolysis, autosomal recessive osteopetrosis, and Paget disease of bone. Alternatively spliced transcript variants have been described for this locus. | NA |
| RP11-1143G9.4 | ENSG00000257764 | NA | ENSG00000257764 | NA | NA |
| PCP4 | ENSG00000183036 | Purkinje cell protein 4 | 5121 | NA | NA |
| FUT2 | ENSG00000176920 | fucosyltransferase 2 | 2524 | The protein encoded by this gene is a Golgi stack membrane protein that is involved in the creation of a precursor of the H antigen, which is required for the final step in the soluble A and B antigen synthesis pathway. This gene is one of two encoding the galactoside 2-L-fucosyltransferase enzyme. Two transcript variants encoding the same protein have been found for this gene. | NA |
| CORO2A | ENSG00000106789 | coronin 2A | 7464 | This gene encodes a member of the WD repeat protein family. WD repeats are minimally conserved regions of approximately 40 amino acids typically bracketed by gly-his and trp-asp (GH-WD), which may facilitate formation of heterotrimeric or multiprotein complexes. Members of this family are involved in a variety of cellular processes, including cell cycle progression, signal transduction, apoptosis, and gene regulation. This protein contains 5 WD repeats, and has a structural similarity with actin-binding proteins: the D. discoideum coronin and the human p57 protein, suggesting that this protein may also be an actin-binding protein that regulates cell motility. Alternative splicing of this gene generates 2 transcript variants. | NA |
| CTB-191K22.5 | ENSG00000267815 | NA | ENSG00000267815 | NA | NA |
| AKR7A3 | ENSG00000162482 | aldo-keto reductase family 7 member A3 | 22977 | Aldo-keto reductases, such as AKR7A3, are involved in the detoxification of aldehydes and ketones. | NA |
| KBTBD8 | ENSG00000163376 | kelch repeat and BTB domain containing 8 | 84541 | NA | NA |
| CD4 | ENSG00000010610 | CD4 molecule | 920 | This gene encodes a membrane glycoprotein of T lymphocytes that interacts with major histocompatibility complex class II antigens and is also a receptor for the human immunodeficiency virus. This gene is expressed not only in T lymphocytes, but also in B cells, macrophages, and granulocytes. It is also expressed in specific regions of the brain. The protein functions to initiate or augment the early phase of T-cell activation, and may function as an important mediator of indirect neuronal damage in infectious and immune-mediated diseases of the central nervous system. Multiple alternatively spliced transcript variants encoding different isoforms have been identified in this gene. | NA |
| CYBA | ENSG00000051523 | cytochrome b-245 alpha chain | 1535 | Cytochrome b is comprised of a light chain (alpha) and a heavy chain (beta). This gene encodes the light, alpha subunit which has been proposed as a primary component of the microbicidal oxidase system of phagocytes. Mutations in this gene are associated with autosomal recessive chronic granulomatous disease (CGD), that is characterized by the failure of activated phagocytes to generate superoxide, which is important for the microbicidal activity of these cells. | NA |
| BCL2L15 | ENSG00000188761 | BCL2 like 15 | 440603 | NA | NA |
| LAPTM5 | ENSG00000162511 | lysosomal protein transmembrane 5 | 7805 | This gene encodes a transmembrane receptor that is associated with lysosomes. The encoded protein, also known as E3 protein, may play a role in hematopoiesis. | NA |
| KCNAB2 | ENSG00000069424 | potassium voltage-gated channel subfamily A regulatory beta subunit 2 | 8514 | Voltage-gated potassium (Kv) channels represent the most complex class of voltage-gated ion channels from both functional and structural standpoints. Their diverse functions include regulating neurotransmitter release, heart rate, insulin secretion, neuronal excitability, epithelial electrolyte transport, smooth muscle contraction, and cell volume. Four sequence-related potassium channel genes - shaker, shaw, shab, and shal - have been identified in Drosophila, and each has been shown to have human homolog(s). This gene encodes a member of the potassium channel, voltage-gated, shaker-related subfamily. This member is one of the beta subunits, which are auxiliary proteins associating with functional Kv-alpha subunits. This member alters functional properties of the KCNA4 gene product. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | NA |
| ACVRL1 | ENSG00000139567 | activin A receptor like type 1 | 94 | This gene encodes a type I cell-surface receptor for the TGF-beta superfamily of ligands. It shares with other type I receptors a high degree of similarity in serine-threonine kinase subdomains, a glycine- and serine-rich region (called the GS domain) preceding the kinase domain, and a short C-terminal tail. The encoded protein, sometimes termed ALK1, shares similar domain structures with other closely related ALK or activin receptor-like kinase proteins that form a subfamily of receptor serine/threonine kinases. Mutations in this gene are associated with hemorrhagic telangiectasia type 2, also known as Rendu-Osler-Weber syndrome 2. | NA |
| KRT7 | ENSG00000135480 | keratin 7 | 3855 | The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is specifically expressed in the simple epithelia lining the cavities of the internal organs and in the gland ducts and blood vessels. The genes encoding the type II cytokeratins are clustered in a region of chromosome 12q12-q13. Alternative splicing may result in several transcript variants; however, not all variants have been fully described. | NA |
| SELL | ENSG00000188404 | selectin L | 6402 | This gene encodes a cell surface adhesion molecule that belongs to a family of adhesion/homing receptors. The encoded protein contains a C-type lectin-like domain, a calcium-binding epidermal growth factor-like domain, and two short complement-like repeats. The gene product is required for binding and subsequent rolling of leucocytes on endothelial cells, facilitating their migration into secondary lymphoid organs and inflammation sites. Single-nucleotide polymorphisms in this gene have been associated with various diseases including immunoglobulin A nephropathy. Alternatively spliced transcript variants have been found for this gene. | NA |
| RP11-867G23.8 | ENSG00000255468 | NA | ENSG00000255468 | NA | NA |
| DOK3 | ENSG00000146094 | docking protein 3 | 79930 | NA | NA |
| SNX10 | ENSG00000086300 | sorting nexin 10 | 29887 | This gene encodes a member of the sorting nexin family. Members of this family contain a phox (PX) domain, which is a phosphoinositide binding domain, and are involved in intracellular trafficking. This protein does not contain a coiled coil region, like some family members. This gene may play a role in regulating endosome homeostasis. Alternative splicing results in multiple transcript variants. | NA |
| FYB | ENSG00000082074 | FYN binding protein | 2533 | The protein encoded by this gene is an adapter for the FYN protein and LCP2 signaling cascades in T-cells. The encoded protein is involved in platelet activation and controls the expression of interleukin-2. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| CTD-2020K17.4 | ENSG00000233483 | NA | ENSG00000233483 | NA | NA |
| CALML5 | ENSG00000178372 | calmodulin like 5 | 51806 | This gene encodes a novel calcium binding protein expressed in the epidermis and related to the calmodulin family of calcium binding proteins. Functional studies with recombinant protein demonstrate it does bind calcium and undergoes a conformational change when it does so. Abundant expression is detected only in reconstructed epidermis and is restricted to differentiating keratinocytes. In addition, it can associate with transglutaminase 3, shown to be a key enzyme in the terminal differentiation of keratinocytes. | NA |
| TBX15 | ENSG00000092607 | T-box 15 | 6913 | This gene belongs to the T-box family of genes, which encode a phylogenetically conserved family of transcription factors that regulate a variety of developmental processes. All these genes contain a common T-box DNA-binding domain. Mutations in this gene are associated with Cousin syndrome. | NA |
| HAVCR2 | ENSG00000135077 | hepatitis A virus cellular receptor 2 | 84868 | The protein encoded by this gene belongs to the immunoglobulin superfamily, and TIM family of proteins. CD4-positive T helper lymphocytes can be divided into types 1 (Th1) and 2 (Th2) on the basis of their cytokine secretion patterns. Th1 cells are involved in cell-mediated immunity to intracellular pathogens and delayed-type hypersensitivity reactions, whereas, Th2 cells are involved in the control of extracellular helminthic infections and the promotion of atopic and allergic diseases. This protein is a Th1-specific cell surface protein that regulates macrophage activation, and inhibits Th1-mediated auto- and alloimmune responses, and promotes immunological tolerance. | NA |
| SFTPA2 | ENSG00000185303 | surfactant protein A2 | 729238 | This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",16,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
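The query-and-write step is repeated by hand for each cluster (each row of `gene_list`). A minimal sketch of how it could be wrapped in one helper follows. The names `annotate_cluster` and `query_fun` are hypothetical and not part of the original analysis; the default query simply reuses the `mygene::queryMany` call shown in this document, and the file naming mirrors the `gene_names_clus_<k>.txt` pattern used above.

```r
# Hypothetical helper (not from the original analysis): annotate one cluster
# and write its Ensembl IDs to gene_names_clus_<k>.txt inside out_dir.
# query_fun is an assumed hook so the lookup can be swapped or stubbed;
# its default repeats the mygene::queryMany call used in this document.
annotate_cluster <- function(ids, k, out_dir,
                             query_fun = function(ids) {
                               mygene::queryMany(ids, scopes = "ensembl.gene",
                                                 fields = c("name", "summary", "symbol"),
                                                 species = "human")
                             }) {
  out <- as.data.frame(query_fun(ids))
  out_file <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  # One Ensembl ID per line, no header, no quoting (same options as above).
  write.table(out$query, out_file,
              col.names = FALSE, row.names = FALSE, quote = FALSE)
  out
}

# Possible usage, assuming gene_list is the matrix built earlier:
# for (k in seq_len(nrow(gene_list))) {
#   out <- annotate_cluster(gene_list[k, ], k, "../utilities/GTEX2013_sparse_fac_voom")
# }
```

Passing the query as an argument is only a convenience for running the loop offline or against a cached result; the default path needs the `mygene` package and network access.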
out <- mygene::queryMany(gene_list[17,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | X_id | query | symbol | summary | notfound |
|---|---|---|---|---|---|
| ras homolog family member V | 171177 | ENSG00000104140 | RHOV | NA | NA |
| serum amyloid A1 | 6288 | ENSG00000173432 | SAA1 | This gene encodes a member of the serum amyloid A family of apolipoproteins. The encoded preproprotein is proteolytically processed to generate the mature protein. This protein is a major acute phase protein that is highly expressed in response to inflammation and tissue injury. This protein also plays an important role in HDL metabolism and cholesterol homeostasis. High levels of this protein are associated with chronic inflammatory diseases including atherosclerosis, rheumatoid arthritis, Alzheimer’s disease and Crohn’s disease. This protein may also be a potential biomarker for certain tumors. Alternate splicing results in multiple transcript variants that encode the same protein. A pseudogene of this gene is found on chromosome 11. | NA |
| CD200 molecule | 4345 | ENSG00000091972 | CD200 | This gene encodes a type I membrane glycoprotein containing two extracellular immunoglobulin domains, a transmembrane and a cytoplasmic domain. This gene is expressed by various cell types, including B cells, a subset of T cells, thymocytes, endothelial cells, and neurons. The encoded protein plays an important role in immunosuppression and regulation of anti-tumor activity. Alternative splicing results in multiple transcript variants encoding different isoforms. | NA |
| ankyrin repeat domain 22 | 118932 | ENSG00000152766 | ANKRD22 | NA | NA |
| leucine rich repeat containing 1 | 55227 | ENSG00000137269 | LRRC1 | NA | NA |
| NA | ENSG00000271218 | ENSG00000271218 | RP3-523E19.2 | NA | NA |
| syntaxin 19 | 415117 | ENSG00000178750 | STX19 | NA | NA |
| delta(4)-desaturase, sphingolipid 2 | 123099 | ENSG00000168350 | DEGS2 | This gene encodes a bifunctional enzyme that is involved in the biosynthesis of phytosphingolipids in human skin and in other phytosphingolipid-containing tissues. This enzyme can act as a sphingolipid delta(4)-desaturase, and also as a sphingolipid C4-hydroxylase. | NA |
| NA | ENSG00000271133 | ENSG00000271133 | CTA-293F17.1 | NA | NA |
| synaptotagmin like 1 | 84958 | ENSG00000142765 | SYTL1 | NA | NA |
| 4-hydroxyphenylpyruvate dioxygenase | 3242 | ENSG00000158104 | HPD | The protein encoded by this gene is an enzyme in the catabolic pathway of tyrosine. The encoded protein catalyzes the conversion of 4-hydroxyphenylpyruvate to homogentisate. Defects in this gene are a cause of tyrosinemia type 3 (TYRO3) and hawkinsinuria (HAWK). Two transcript variants encoding different isoforms have been found for this gene. | NA |
| SPARC like 1 | 8404 | ENSG00000152583 | SPARCL1 | NA | NA |
| chromosome 15 open reading frame 48 | 84419 | ENSG00000166920 | C15orf48 | This gene was first identified in a study of human esophageal squamous cell carcinoma tissues. Levels of both the message and protein are reduced in carcinoma samples. In adult human tissues, this gene is expressed in the esophagus, stomach, small intestine, colon and placenta. Alternatively spliced transcript variants that encode the same protein have been identified. | NA |
| macrophage stimulating 1 receptor | 4486 | ENSG00000164078 | MST1R | This gene encodes a cell surface receptor for macrophage-stimulating protein (MSP) with tyrosine kinase activity. The mature form of this protein is a heterodimer of disulfide-linked alpha and beta subunits, generated by proteolytic cleavage of a single-chain precursor. The beta subunit undergoes tyrosine phosphorylation upon stimulation by MSP. This protein is expressed on the ciliated epithelia of the mucociliary transport apparatus of the lung, and together with MSP, thought to be involved in host defense. Alternative splicing generates multiple transcript variants encoding different isoforms that may undergo similar proteolytic processing. | NA |
| calponin 1 | 1264 | ENSG00000130176 | CNN1 | NA | NA |
| junctophilin 1 | 56704 | ENSG00000104369 | JPH1 | Junctional complexes between the plasma membrane and endoplasmic/sarcoplasmic reticulum are a common feature of all excitable cell types and mediate cross talk between cell surface and intracellular ion channels. The protein encoded by this gene is a component of junctional complexes and is composed of a C-terminal hydrophobic segment spanning the endoplasmic/sarcoplasmic reticulum membrane and a remaining cytoplasmic domain that shows specific affinity for the plasma membrane. This gene is a member of the junctophilin gene family. | NA |
| NA | NA | ENSG00000205246 | NA | NA | TRUE |
| NOTCH-regulated ankyrin repeat protein | 441478 | ENSG00000198435 | NRARP | NA | NA |
| N-acetylglutamate synthase | 162417 | ENSG00000161653 | NAGS | The N-acetylglutamate synthase gene encodes a mitochondrial enzyme that catalyzes the formation of N-acetylglutamate (NAG) from glutamate and acetyl coenzyme-A. NAG is a cofactor of carbamyl phosphate synthetase I (CPSI), the first enzyme of the urea cycle in mammals. This gene may regulate ureagenesis by altering NAG availability and, thereby, CPSI activity. Deficiencies in N-acetylglutamate synthase have been associated with hyperammonemia. | NA |
| apolipoprotein C1 | 341 | ENSG00000130208 | APOC1 | This gene encodes a member of the apolipoprotein C1 family. This gene is expressed primarily in the liver, and it is activated when monocytes differentiate into macrophages. The encoded protein plays a central role in high density lipoprotein (HDL) and very low density lipoprotein (VLDL) metabolism. This protein has also been shown to inhibit cholesteryl ester transfer protein in plasma. A pseudogene of this gene is located 4 kb downstream in the same orientation, on the same chromosome. This gene is mapped to chromosome 19, where it resides within an apolipoprotein gene cluster. | NA |
| formimidoyltransferase cyclodeaminase | 10841 | ENSG00000160282 | FTCD | The protein encoded by this gene is a bifunctional enzyme that channels 1-carbon units from formiminoglutamate, a metabolite of the histidine degradation pathway, to the folate pool. Mutations in this gene are associated with glutamate formiminotransferase deficiency. Alternatively spliced transcript variants have been found for this gene. | NA |
| fatty acyl-CoA reductase 1 | 84188 | ENSG00000197601 | FAR1 | The protein encoded by this gene is required for the reduction of fatty acids to fatty alcohols, a process that is required for the synthesis of monoesters and ether lipids. NADPH is required as a cofactor in this reaction, and 16-18 carbon saturated and unsaturated fatty acids are the preferred substrate. This is a peroxisomal membrane protein, and studies suggest that the N-terminus contains a large catalytic domain located on the outside of the peroxisome, while the C-terminus is exposed to the matrix of the peroxisome. Studies indicate that the regulation of this protein is dependent on plasmalogen levels. Mutations in this gene have been associated with individuals affected by severe intellectual disability, early-onset epilepsy, microcephaly, congenital cataracts, growth retardation, and spasticity (PMID: 25439727). A pseudogene of this gene is located on chromosome 13. | NA |
| complement C1r subcomponent | 715 | ENSG00000159403 | C1R | NA | NA |
| calsequestrin 2 | 845 | ENSG00000118729 | CASQ2 | The protein encoded by this gene specifies the cardiac muscle family member of the calsequestrin family. Calsequestrin is localized to the sarcoplasmic reticulum in cardiac and slow skeletal muscle cells. The protein is a calcium binding protein that stores calcium for muscle function. Mutations in this gene cause stress-induced polymorphic ventricular tachycardia, also referred to as catecholaminergic polymorphic ventricular tachycardia 2 (CPVT2), a disease characterized by bidirectional ventricular tachycardia that may lead to cardiac arrest. | NA |
| myelin protein zero like 2 | 10205 | ENSG00000149573 | MPZL2 | Thymus development depends on a complex series of interactions between thymocytes and the stromal component of the organ. Epithelial V-like antigen (EVA) is expressed in thymus epithelium and strongly downregulated by thymocyte developmental progression. This gene is expressed in the thymus and in several epithelial structures early in embryogenesis. It is highly homologous to the myelin protein zero and, in thymus-derived epithelial cell lines, is poorly soluble in nonionic detergents, strongly suggesting an association to the cytoskeleton. Its capacity to mediate cell adhesion through a homophilic interaction and its selective regulation by T cell maturation might imply the participation of EVA in the earliest phases of thymus organogenesis. The protein bears a characteristic V-type domain and two potential N-glycosylation sites in the extracellular domain; a putative serine phosphorylation site for casein kinase 2 is also present in the cytoplasmic tail. Two transcript variants encoding the same protein have been found for this gene. | NA |
| alpha tocopherol transfer protein like | 79183 | ENSG00000124120 | TTPAL | NA | NA |
| KIT proto-oncogene receptor tyrosine kinase | 3815 | ENSG00000157404 | KIT | This gene encodes the human homolog of the proto-oncogene c-kit. C-kit was first identified as the cellular homolog of the feline sarcoma viral oncogene v-kit. This protein is a type 3 transmembrane receptor for MGF (mast cell growth factor, also known as stem cell factor). Mutations in this gene are associated with gastrointestinal stromal tumors, mast cell disease, acute myelogenous leukemia, and piebaldism. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| netrin 1 | 9423 | ENSG00000065320 | NTN1 | Netrin is included in a family of laminin-related secreted proteins. The function of this gene has not yet been defined; however, netrin is thought to be involved in axon guidance and cell migration during development. Mutations and loss of expression of netrin suggest that variation in netrin may be involved in cancer development. | NA |
| serum amyloid A2 | 6289 | ENSG00000134339 | SAA2 | NA | NA |
| orosomucoid 1 | 5004 | ENSG00000229314 | ORM1 | This gene encodes a key acute phase plasma protein. Because of its increase due to acute inflammation, this protein is classified as an acute-phase reactant. The specific function of this protein has not yet been determined; however, it may be involved in aspects of immunosuppression. | NA |
| serine peptidase inhibitor, Kunitz type 1 | 6692 | ENSG00000166145 | SPINT1 | The protein encoded by this gene is a member of the Kunitz family of serine protease inhibitors. The protein is a potent inhibitor specific for HGF activator and is thought to be involved in the regulation of the proteolytic activation of HGF in injured tissues. Alternative splicing results in multiple variants encoding different isoforms. | NA |
| CCAAT/enhancer binding protein beta | 1051 | ENSG00000172216 | CEBPB | This intronless gene encodes a transcription factor that contains a basic leucine zipper (bZIP) domain. The encoded protein functions as a homodimer but can also form heterodimers with CCAAT/enhancer-binding proteins alpha, delta, and gamma. Activity of this protein is important in the regulation of genes involved in immune and inflammatory responses, among other processes. The use of alternative in-frame AUG start codons results in multiple protein isoforms, each with distinct biological functions. | NA |
| NA | ENSG00000269918 | ENSG00000269918 | AF131215.9 | NA | NA |
| aldehyde oxidase 1 | 316 | ENSG00000138356 | AOX1 | Aldehyde oxidase produces hydrogen peroxide and, under certain conditions, can catalyze the formation of superoxide. Aldehyde oxidase is a candidate gene for amyotrophic lateral sclerosis. | NA |
| platelet derived growth factor subunit A | 5154 | ENSG00000197461 | PDGFA | This gene encodes a member of the protein family comprised of both platelet-derived growth factors (PDGF) and vascular endothelial growth factors (VEGF). The encoded preproprotein is proteolytically processed to generate platelet-derived growth factor subunit A, which can homodimerize, or alternatively, heterodimerize with the related platelet-derived growth factor subunit B. These proteins bind and activate PDGF receptor tyrosine kinases, which play a role in a wide range of developmental processes. Alternative splicing results in multiple transcript variants. | NA |
| isovaleryl-CoA dehydrogenase | 3712 | ENSG00000128928 | IVD | Isovaleryl-CoA dehydrogenase (IVD) is a mitochondrial matrix enzyme that catalyzes the third step in leucine catabolism. The genetic deficiency of IVD results in an accumulation of isovaleric acid, which is toxic to the central nervous system and leads to isovaleric acidemia. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| amyloid beta precursor like protein 1 | 333 | ENSG00000105290 | APLP1 | This gene encodes a member of the highly conserved amyloid precursor protein gene family. The encoded protein is a membrane-associated glycoprotein that is cleaved by secretases in a manner similar to amyloid beta A4 precursor protein cleavage. This cleavage liberates an intracellular cytoplasmic fragment that may act as a transcriptional activator. The encoded protein may also play a role in synaptic maturation during cortical development. Alternatively spliced transcript variants encoding different isoforms have been described. | NA |
| NA | NA | ENSG00000241732 | NA | NA | TRUE |
| protein disulfide isomerase family A member 2 | 64714 | ENSG00000185615 | PDIA2 | Protein disulfide isomerases (EC 5.3.4.1), such as PDIP, are endoplasmic reticulum (ER) resident proteins that catalyze protein folding and thiol-disulfide interchange reactions (Desilva et al., 1996 [PubMed 8561901]). | NA |
| prostaglandin E synthase 3 (cytosolic)-like | 100885848 | ENSG00000267060 | PTGES3L | NA | NA |
| tandem C2 domains, nuclear | 123036 | ENSG00000165929 | TC2N | NA | NA |
| uncharacterized LOC105378272 | 105378272 | ENSG00000230555 | LOC105378272 | NA | NA |
| cytochrome p450 oxidoreductase | 5447 | ENSG00000127948 | POR | This gene encodes an endoplasmic reticulum membrane oxidoreductase with an FAD-binding domain and a flavodoxin-like domain. The protein binds two cofactors, FAD and FMN, which allow it to donate electrons directly from NADPH to all microsomal P450 enzymes. Mutations in this gene have been associated with various diseases, including apparent combined P450C17 and P450C21 deficiency, amenorrhea and disordered steroidogenesis, congenital adrenal hyperplasia and Antley-Bixler syndrome. | NA |
| major facilitator superfamily domain containing 4A | 148808 | ENSG00000174514 | MFSD4A | NA | NA |
| regulating synaptic membrane exocytosis 3 | 9783 | ENSG00000117016 | RIMS3 | NA | NA |
| RNA binding motif protein 11 | 54033 | ENSG00000185272 | RBM11 | NA | NA |
| immunoglobulin heavy constant alpha 1 | ENSG00000211895 | ENSG00000211895 | IGHA1 | NA | NA |
| haptoglobin | 3240 | ENSG00000257017 | HP | This gene encodes a preproprotein, which is processed to yield both alpha and beta chains, which subsequently combine as a tetramer to produce haptoglobin. Haptoglobin functions to bind free plasma hemoglobin, which allows degradative enzymes to gain access to the hemoglobin, while at the same time preventing loss of iron through the kidneys and protecting the kidneys from damage by hemoglobin. Mutations in this gene and/or its regulatory regions cause ahaptoglobinemia or hypohaptoglobinemia. This gene has also been linked to diabetic nephropathy, the incidence of coronary artery disease in type 1 diabetes, Crohn’s disease, inflammatory disease behavior, primary sclerosing cholangitis, susceptibility to idiopathic Parkinson’s disease, and a reduced incidence of Plasmodium falciparum malaria. The protein encoded also exhibits antimicrobial activity against bacteria. A similar duplicated gene is located next to this gene on chromosome 16. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| dual specificity phosphatase 23 | 54935 | ENSG00000158716 | DUSP23 | NA | NA |
| allograft inflammatory factor 1 like | 83543 | ENSG00000126878 | AIF1L | NA | NA |
| tumor protein p73 | 7161 | ENSG00000078900 | TP73 | This gene encodes a member of the p53 family of transcription factors involved in cellular responses to stress and development. It maps to a region on chromosome 1p36 that is frequently deleted in neuroblastoma and other tumors, and thought to contain multiple tumor suppressor genes. The demonstration that this gene is monoallelically expressed (likely from the maternal allele), supports the notion that it is a candidate gene for neuroblastoma. Many transcript variants resulting from alternative splicing and/or use of alternate promoters have been found for this gene, but the biological validity and the full-length nature of some variants have not been determined. | NA |
| uncharacterized LOC101930370 | 101930370 | ENSG00000245213 | LOC101930370 | NA | NA |
| plakophilin 3 | 11187 | ENSG00000184363 | PKP3 | This gene encodes a member of the arm-repeat (armadillo) and plakophilin gene families. Plakophilin proteins contain numerous armadillo repeats, localize to cell desmosomes and nuclei, and participate in linking cadherins to intermediate filaments in the cytoskeleton. This protein may act in cellular desmosome-dependent adhesion and signaling pathways. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| polypeptide N-acetylgalactosaminyltransferase 7 | 51809 | ENSG00000109586 | GALNT7 | This gene encodes GalNAc transferase 7, a member of the GalNAc-transferase family. The enzyme encoded by this gene controls the initiation step of mucin-type O-linked protein glycosylation and transfer of N-acetylgalactosamine to serine and threonine amino acid residues. This enzyme is a type II transmembrane protein and shares common sequence motifs with other family members. Unlike other family members, this enzyme shows exclusive specificity for partially GalNAc-glycosylated acceptor substrates and shows no activity with non-glycosylated peptides. This protein may function as a follow-up enzyme in the initiation step of O-glycosylation. | NA |
| neurexophilin 3 | 11248 | ENSG00000182575 | NXPH3 | NA | NA |
| steroidogenic acute regulatory protein | 6770 | ENSG00000147465 | STAR | The protein encoded by this gene plays a key role in the acute regulation of steroid hormone synthesis by enhancing the conversion of cholesterol into pregnenolone. This protein permits the cleavage of cholesterol into pregnenolone by mediating the transport of cholesterol from the outer mitochondrial membrane to the inner mitochondrial membrane. Mutations in this gene are a cause of congenital lipoid adrenal hyperplasia (CLAH), also called lipoid CAH. A pseudogene of this gene is located on chromosome 13. | NA |
| hydroxysteroid 11-beta dehydrogenase 1 | 3290 | ENSG00000117594 | HSD11B1 | The protein encoded by this gene is a microsomal enzyme that catalyzes the conversion of the stress hormone cortisol to the inactive metabolite cortisone. In addition, the encoded protein can catalyze the reverse reaction, the conversion of cortisone to cortisol. Too much cortisol can lead to central obesity, and a particular variation in this gene has been associated with obesity and insulin resistance in children. Mutations in this gene and H6PD (hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase)) are the cause of cortisone reductase deficiency. Alternate splicing results in multiple transcript variants encoding the same protein. | NA |
| placenta specific 8 | 51316 | ENSG00000145287 | PLAC8 | NA | NA |
| insulin receptor substrate 2 | 8660 | ENSG00000185950 | IRS2 | This gene encodes the insulin receptor substrate 2, a cytoplasmic signaling molecule that mediates effects of insulin, insulin-like growth factor 1, and other cytokines by acting as a molecular adaptor between diverse receptor tyrosine kinases and downstream effectors. The product of this gene is phosphorylated by the insulin receptor tyrosine kinase upon receptor stimulation, as well as by an interleukin 4 receptor-associated kinase in response to IL4 treatment. | NA |
| NA | NA | ENSG00000273281 | NA | NA | TRUE |
| ribosomal protein S3a pseudogene 47 | ENSG00000205871 | ENSG00000205871 | RPS3AP47 | NA | NA |
| leucine zipper and EF-hand containing transmembrane protein 2 | 137994 | ENSG00000165046 | LETM2 | NA | NA |
| hexose-6-phosphate dehydrogenase/glucose 1-dehydrogenase | 9563 | ENSG00000049239 | H6PD | There are 2 forms of glucose-6-phosphate dehydrogenase. G form is X-linked and H form, encoded by this gene, is autosomally linked. This H form shows activity with other hexose-6-phosphates, especially galactose-6-phosphate, whereas the G form is specific for glucose-6-phosphate. Both forms are present in most tissues, but H form is not found in red cells. | NA |
| atypical chemokine receptor 1 (Duffy blood group) | 2532 | ENSG00000213088 | ACKR1 | The protein encoded by this gene is a glycosylated membrane protein and a non-specific receptor for several chemokines. The encoded protein is the receptor for the human malarial parasites Plasmodium vivax and Plasmodium knowlesi. Polymorphisms in this gene are the basis of the Duffy blood group system. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| NA | NA | ENSG00000204807 | NA | NA | TRUE |
| asialoglycoprotein receptor 1 | 432 | ENSG00000141505 | ASGR1 | This gene encodes a subunit of the asialoglycoprotein receptor. This receptor is a transmembrane protein that plays a critical role in serum glycoprotein homeostasis by mediating the endocytosis and lysosomal degradation of glycoproteins with exposed terminal galactose or N-acetylgalactosamine residues. The asialoglycoprotein receptor may facilitate hepatic infection by multiple viruses including hepatitis B, and is also a target for liver-specific drug delivery. The asialoglycoprotein receptor is a hetero-oligomeric protein composed of major and minor subunits, which are encoded by different genes. The protein encoded by this gene is the more abundant major subunit. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| potassium calcium-activated channel subfamily N member 4 | 3783 | ENSG00000104783 | KCNN4 | The protein encoded by this gene is part of a potentially heterotetrameric voltage-independent potassium channel that is activated by intracellular calcium. Activation is followed by membrane hyperpolarization, which promotes calcium influx. The encoded protein may be part of the predominant calcium-activated potassium channel in T-lymphocytes. This gene is similar to other KCNN family potassium channel genes, but it differs enough to possibly be considered as part of a new subfamily. | NA |
| heparin binding EGF like growth factor | 1839 | ENSG00000113070 | HBEGF | NA | NA |
| myelin protein zero like 3 | 196264 | ENSG00000160588 | MPZL3 | NA | NA |
| dysbindin domain containing 2 | 55861 | ENSG00000244274 | DBNDD2 | NA | NA |
| potassium two pore domain channel subfamily K member 1 | 3775 | ENSG00000135750 | KCNK1 | This gene encodes one of the members of the superfamily of potassium channel proteins containing two pore-forming P domains. The product of this gene has not been shown to be a functional channel, however, it may require other non-pore-forming proteins for activity. | NA |
| serine peptidase inhibitor, Kazal type 1 | 6690 | ENSG00000164266 | SPINK1 | The protein encoded by this gene is a trypsin inhibitor, which is secreted from pancreatic acinar cells into pancreatic juice. It is thought to function in the prevention of trypsin-catalyzed premature activation of zymogens within the pancreas and the pancreatic duct. Mutations in this gene are associated with hereditary pancreatitis and tropical calcific pancreatitis. | NA |
| NA | ENSG00000248774 | ENSG00000248774 | RP11-798M19.3 | NA | NA |
| NA | ENSG00000264924 | ENSG00000264924 | RP11-799B12.2 | NA | NA |
| protease, serine 3 | 5646 | ENSG00000010438 | PRSS3 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is expressed in the brain and pancreas and is resistant to common trypsin inhibitors. It is active on peptide linkages involving the carboxyl group of lysine or arginine. This gene is localized to the locus of T cell receptor beta variable orphans on chromosome 9. Four transcript variants encoding different isoforms have been described for this gene. | NA |
| regenerating family member 1 beta | 5968 | ENSG00000172023 | REG1B | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV based on the primary structures of the encoded proteins. This gene encodes a protein secreted by the exocrine pancreas that is highly similar to the REG1A protein. The related REG1A protein is associated with islet cell regeneration and diabetogenesis, and may be involved in pancreatic lithogenesis. Reg family members REG1A, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| glycerol-3-phosphate acyltransferase, mitochondrial | 57678 | ENSG00000119927 | GPAM | This gene encodes a mitochondrial enzyme which prefers saturated fatty acids as its substrate for the synthesis of glycerolipids. This metabolic pathway’s first step is catalyzed by the encoded enzyme. Two forms for this enzyme exist, one in the mitochondria and one in the endoplasmic reticulum. Two alternatively spliced transcript variants have been described for this gene. | NA |
| chymotrypsin like elastase family member 3B | 23436 | ENSG00000219073 | CELA3B | Elastases form a subfamily of serine proteases that hydrolyze many proteins in addition to elastin. Humans have six elastase genes which encode the structurally similar proteins elastase 1, 2, 2A, 2B, 3A, and 3B. Unlike other elastases, elastase 3B has little elastolytic activity. Like most of the human elastases, elastase 3B is secreted from the pancreas as a zymogen and, like other serine proteases such as trypsin, chymotrypsin and kallikrein, it has a digestive function in the intestine. Elastase 3B preferentially cleaves proteins after alanine residues. Elastase 3B may also function in the intestinal transport and metabolism of cholesterol. Both elastase 3A and elastase 3B have been referred to as protease E and as elastase 1, and excretion of this protein in fecal material is frequently used as a measure of pancreatic function in clinical assays. | NA |
| regenerating family member 1 alpha | 5967 | ENSG00000115386 | REG1A | This gene is a type I subclass member of the Reg gene family. The Reg gene family is a multigene family grouped into four subclasses, types I, II, III and IV, based on the primary structures of the encoded proteins. This gene encodes a protein that is secreted by the exocrine pancreas. It is associated with islet cell regeneration and diabetogenesis and may be involved in pancreatic lithogenesis. Reg family members REG1B, REGL, PAP and this gene are tandemly clustered on chromosome 2p12 and may have arisen from the same ancestral gene by gene duplication. | NA |
| lysozyme | 4069 | ENSG00000090382 | LYZ | This gene encodes human lysozyme, whose natural substrate is the bacterial cell wall peptidoglycan (cleaving the beta[1-4]glycosidic linkages between N-acetylmuramic acid and N-acetylglucosamine). Lysozyme is one of the antimicrobial agents found in human milk, and is also present in spleen, lung, kidney, white blood cells, plasma, saliva, and tears. The protein has antibacterial activity against a number of bacterial species. Missense mutations in this gene have been identified in heritable renal amyloidosis. | NA |
| retinol binding protein 4 | 5950 | ENSG00000138207 | RBP4 | This protein belongs to the lipocalin family and is the specific carrier for retinol (vitamin A alcohol) in the blood. It delivers retinol from the liver stores to the peripheral tissues. In plasma, the RBP-retinol complex interacts with transthyretin which prevents its loss by filtration through the kidney glomeruli. A deficiency of vitamin A blocks secretion of the binding protein posttranslationally and results in defective delivery and supply to the epidermal cells. | NA |
| NA | ENSG00000263335 | ENSG00000263335 | AF001548.5 | NA | NA |
| apolipoprotein C3 | 345 | ENSG00000110245 | APOC3 | Apolipoprotein C-III is a very low density lipoprotein (VLDL) protein. APOC3 inhibits lipoprotein lipase and hepatic lipase; it is thought to delay catabolism of triglyceride-rich particles. The APOA1, APOC3 and APOA4 genes are closely linked in both rat and human genomes. The A-I and A-IV genes are transcribed from the same strand, while the A-I and C-III genes are convergently transcribed. An increase in apoC-III levels induces the development of hypertriglyceridemia. | NA |
| NA | ENSG00000244021 | ENSG00000244021 | RP11-50D9.1 | NA | NA |
| mucin 1, cell surface associated | 4582 | ENSG00000185499 | MUC1 | This gene encodes a membrane-bound protein that is a member of the mucin family. Mucins are O-glycosylated proteins that play an essential role in forming protective mucous barriers on epithelial surfaces. These proteins also play a role in intracellular signaling. This protein is expressed on the apical surface of epithelial cells that line the mucosal surfaces of many different tissues including lung, breast, stomach and pancreas. This protein is proteolytically cleaved into alpha and beta subunits that form a heterodimeric complex. The N-terminal alpha subunit functions in cell-adhesion and the C-terminal beta subunit is involved in cell signaling. Overexpression, aberrant intracellular localization, and changes in glycosylation of this protein have been associated with carcinomas. This gene is known to contain a highly polymorphic variable number tandem repeats (VNTR) domain. Alternate splicing results in multiple transcript variants. | NA |
| zinc finger protein 738 | ENSG00000172687 | ENSG00000172687 | ZNF738 | NA | NA |
| NA | ENSG00000272512 | ENSG00000272512 | RP11-54O7.17 | NA | NA |
| WD repeat domain 76 | 79968 | ENSG00000092470 | WDR76 | NA | NA |
| joining chain of multimeric IgA and IgM | 3512 | ENSG00000132465 | JCHAIN | NA | NA |
| activating transcription factor 5 | 22809 | ENSG00000169136 | ATF5 | NA | NA |
| SRY-box 4 | 6659 | ENSG00000124766 | SOX4 | This intronless gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of the cell fate. The encoded protein may act as a transcriptional regulator after forming a protein complex with other proteins, such as syndecan binding protein (syntenin). The protein may function in the apoptosis pathway leading to cell death as well as to tumorigenesis and may mediate downstream effects of parathyroid hormone (PTH) and PTH-related protein (PTHrP) in bone development. The solution structure has been resolved for the HMG-box of a similar mouse protein. | NA |
| STE20-related kinase adaptor beta | 55437 | ENSG00000082146 | STRADB | This gene encodes a protein that belongs to the serine/threonine protein kinase STE20 subfamily. One of the active site residues in the protein kinase domain of this protein is altered, and it is thus a pseudokinase. This protein is a component of a complex involved in the activation of serine/threonine kinase 11, a master kinase that regulates cell polarity and energy-generating metabolism. This complex regulates the relocation of this kinase from the nucleus to the cytoplasm, and it is essential for G1 cell cycle arrest mediated by this kinase. The protein encoded by this gene can also interact with the X chromosome-linked inhibitor of apoptosis protein, and this interaction enhances the anti-apoptotic activity of this protein via the JNK1 signal transduction pathway. Two pseudogenes, located on chromosomes 1 and 7, have been found for this gene. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| vitronectin | 7448 | ENSG00000109072 | VTN | The protein encoded by this gene is a member of the pexin family. It is found in serum and tissues and promotes cell adhesion and spreading, inhibits the membrane-damaging effect of the terminal cytolytic complement pathway, and binds to several serpin serine protease inhibitors. It is a secreted protein and exists in either a single chain form or a clipped, two chain form held together by a disulfide bond. | NA |
| threonine synthase like 2 | 55258 | ENSG00000144115 | THNSL2 | This gene encodes a threonine synthase-like protein. A similar enzyme in mouse can catalyze the degradation of O-phospho-homoserine to a-ketobutyrate, phosphate, and ammonia. This protein also has phospho-lyase activity on both gamma and beta phosphorylated substrates. In mouse an alternatively spliced form of this protein has been shown to act as a cytokine and can induce the production of the inflammatory cytokine IL6 in osteoblasts. Alternate splicing results in multiple transcript variants. | NA |
| protease, serine 1 | 5644 | ENSG00000204983 | PRSS1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | NA |
| sperm antigen with calponin homology and coiled-coil domains 1 | 92521 | ENSG00000128487 | SPECC1 | The protein encoded by this gene belongs to the cytospin-A family. It is localized in the nucleus, and highly expressed in testis and some cancer cell lines. A chromosomal translocation involving this gene and platelet-derived growth factor receptor, beta gene (PDGFRB) may be a cause of juvenile myelomonocytic leukemia. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | NA |
| ribosomal protein L7 pseudogene 19 | ENSG00000241458 | ENSG00000241458 | RPL7P19 | NA | NA |
| troponin T2, cardiac type | 7139 | ENSG00000118194 | TNNT2 | The protein encoded by this gene is the tropomyosin-binding subunit of the troponin complex, which is located on the thin filament of striated muscles and regulates muscle contraction in response to alterations in intracellular calcium ion concentration. Mutations in this gene have been associated with familial hypertrophic cardiomyopathy as well as with dilated cardiomyopathy. Transcripts for this gene undergo alternative splicing that results in many tissue-specific isoforms, however, the full-length nature of some of these variants has not yet been determined. | NA |
| microsomal glutathione S-transferase 1 | 4257 | ENSG00000008394 | MGST1 | The MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism) family consists of six human proteins, two of which are involved in the production of leukotrienes and prostaglandin E, important mediators of inflammation. Other family members, demonstrating glutathione S-transferase and peroxidase activities, are involved in cellular defense against toxic, carcinogenic, and pharmacologically active electrophilic compounds. This gene encodes a protein that catalyzes the conjugation of glutathione to electrophiles and the reduction of lipid hydroperoxides. This protein is localized to the endoplasmic reticulum and outer mitochondrial membrane where it is thought to protect these membranes from oxidative stress. Several transcript variants, some non-protein coding and some protein coding, have been found for this gene. | NA |
| small nucleolar RNA host gene 12 | 85028 | ENSG00000197989 | SNHG12 | NA | NA |
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_",17,".txt"), col.names = FALSE,
row.names=FALSE, quote=FALSE);
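Some queried Ensembl IDs come back unmatched: in the table above, ENSG00000204807 has `notfound` set to `TRUE` and no symbol or name. The export step writes every `out$query` value regardless, so it can be useful to separate the annotated IDs from the unmatched ones first. The following is a minimal sketch, not part of the original analysis. It uses a small toy data frame in place of `as.data.frame(out)`, and the helper `split_by_notfound` is a hypothetical name introduced here for illustration. It also assumes that the `notfound` column may be absent when every ID is matched, as in the first table of this document.

```r
# Toy stand-in for as.data.frame(out); column names mirror the tables above.
out_toy <- data.frame(
  query    = c("ENSG00000165046", "ENSG00000204807", "ENSG00000049239"),
  symbol   = c("LETM2", NA, "H6PD"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split query IDs into annotated and unmatched sets.
split_by_notfound <- function(df) {
  # Treat a missing notfound column as "everything was matched".
  if ("notfound" %in% names(df)) {
    missing_flag <- df$notfound %in% TRUE  # NA counts as matched
  } else {
    missing_flag <- rep(FALSE, nrow(df))
  }
  list(found = df$query[!missing_flag], missing = df$query[missing_flag])
}

ids <- split_by_notfound(out_toy)
print(ids$found)    # IDs with an annotation record
print(ids$missing)  # IDs flagged notfound = TRUE
```

Either vector can then be passed to `write.table` in place of `out$query`, depending on whether the downstream file should include unmatched IDs.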
out <- mygene::queryMany(gene_list[18,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| summary | X_id | query | symbol | name | notfound |
|---|---|---|---|---|---|
| The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. The use of alternate polyadenylation sites has been found for this gene. | 23555 | ENSG00000099282 | TSPAN15 | tetraspanin 15 | NA |
| Tryptases comprise a family of trypsin-like serine proteases, the peptidase family S1. Tryptases are enzymatically active only as heparin-stabilized tetramers, and they are resistant to all known endogenous proteinase inhibitors. Several tryptase genes are clustered on chromosome 16p13.3. These genes are characterized by several distinct features. They have a highly conserved 3’ UTR and contain tandem repeat sequences at the 5’ flank and 3’ UTR which are thought to play a role in regulation of the mRNA stability. These genes have an intron immediately upstream of the initiator Met codon, which separates the site of transcription initiation from protein coding sequence. This feature is characteristic of tryptases but is unusual in other genes. The alleles of this gene exhibit an unusual amount of sequence variation, such that the alleles were once thought to represent two separate genes, alpha and beta 1. Beta tryptases appear to be the main isoenzymes expressed in mast cells; whereas in basophils, alpha tryptases predominate. Tryptases have been implicated as mediators in the pathogenesis of asthma and other allergic and inflammatory disorders. | 7177 | ENSG00000172236 | TPSAB1 | tryptase alpha/beta 1 | NA |
| This gene encodes a preproprotein that is proteolytically processed to generate a secreted peptide that belongs to the endothelin/sarafotoxin family. This peptide is a potent vasoconstrictor and its cognate receptors are therapeutic targets in the treatment of pulmonary arterial hypertension. Aberrant expression of this gene may promote tumorigenesis. Alternative splicing results in multiple transcript variants. | 1906 | ENSG00000078401 | EDN1 | endothelin 1 | NA |
| There are believed to be over 100 different glycosyltransferases involved in the synthesis of protein-bound and lipid-bound oligosaccharides. The enzyme encoded by this gene transfers a GlcNAc residue to the beta-linked mannose of the trimannosyl core of N-linked oligosaccharides and produces a bisecting GlcNAc. Multiple alternatively spliced variants, encoding the same protein, have been identified. | 4248 | ENSG00000128268 | MGAT3 | mannosyl (beta-1,4-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase | NA |
| NA | 114804 | ENSG00000141576 | RNF157 | ring finger protein 157 | NA |
| The protein encoded by this gene is a small secreted cysteine-rich protein and a member of the CCN family of regulatory proteins. CCN family proteins associate with the extracellular matrix and play an important role in cardiovascular and skeletal development, fibrosis and cancer development. | 4856 | ENSG00000136999 | NOV | nephroblastoma overexpressed | NA |
| This gene encodes a secreted, homodimeric glycoprotein that is expressed in a wide variety of tissues and may have autocrine or paracrine functions. The encoded protein has 10 of its 15 cysteine residues conserved among stanniocalcin family members and is phosphorylated by casein kinase 2 exclusively on its serine residues. Its C-terminus contains a cluster of histidine residues which may interact with metal ions. The protein may play a role in the regulation of renal and intestinal calcium and phosphate transport, cell metabolism, or cellular calcium/phosphate homeostasis. Constitutive overexpression of human stanniocalcin 2 in mice resulted in pre- and postnatal growth restriction, reduced bone and skeletal muscle growth, and organomegaly. Expression of this gene is induced by estrogen and altered in some breast cancers. | 8614 | ENSG00000113739 | STC2 | stanniocalcin 2 | NA |
| NA | 11170 | ENSG00000168309 | FAM107A | family with sequence similarity 107 member A | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum. It has both 17alpha-hydroxylase and 17,20-lyase activities and is a key enzyme in the steroidogenic pathway that produces progestins, mineralocorticoids, glucocorticoids, androgens, and estrogens. Mutations in this gene are associated with isolated steroid-17 alpha-hydroxylase deficiency, 17-alpha-hydroxylase/17,20-lyase deficiency, pseudohermaphroditism, and adrenal hyperplasia. | 1586 | ENSG00000148795 | CYP17A1 | cytochrome P450 family 17 subfamily A member 1 | NA |
| The adhesion G-protein-coupled receptors (GPCRs), including GPR133, are membrane-bound proteins with long N termini containing multiple domains. GPCRs, or GPRs, contain 7 transmembrane domains and transduce extracellular signals through heterotrimeric G proteins (summary by Bjarnadottir et al., 2004 [PubMed 15203201]). | 283383 | ENSG00000111452 | ADGRD1 | adhesion G protein-coupled receptor D1 | NA |
| NA | 100127888 | ENSG00000232803 | SLCO4A1-AS1 | SLCO4A1 antisense RNA 1 | NA |
| NA | 28231 | ENSG00000101187 | SLCO4A1 | solute carrier organic anion transporter family member 4A1 | NA |
| FABP4 encodes the fatty acid binding protein found in adipocytes. Fatty acid binding proteins are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. It is thought that FABPs roles include fatty acid uptake, transport, and metabolism. | 2167 | ENSG00000170323 | FABP4 | fatty acid binding protein 4 | NA |
| NA | ENSG00000254429 | ENSG00000254429 | CTD-2562J17.7 | NA | NA |
| NA | ENSG00000272789 | ENSG00000272789 | RP11-286H15.1 | NA | NA |
| The protein encoded by this gene belongs to the centaurin gamma-like family. It mediates anti-apoptotic effects of nerve growth factor by activating nuclear phosphoinositide 3-kinase. It is overexpressed in cancer cells, and promotes cancer cell invasion. Alternatively spliced transcript variants encoding different isoforms have been described for this gene. | 116986 | ENSG00000135439 | AGAP2 | ArfGAP with GTPase domain, ankyrin repeat and PH domain 2 | NA |
| NA | ENSG00000273055 | ENSG00000273055 | CTB-13F3.1 | NA | NA |
| STXBP6 binds components of the SNARE complex (see MIM 603215) and may be involved in regulating SNARE complex formation (Scales et al., 2002 [PubMed 12145319]). | 29091 | ENSG00000168952 | STXBP6 | syntaxin binding protein 6 | NA |
| Cytochrome c oxidase (COX), the terminal enzyme of the mitochondrial respiratory chain, catalyzes the electron transfer from reduced cytochrome c to oxygen. It is a heteromeric complex consisting of 3 catalytic subunits encoded by mitochondrial genes and multiple structural subunits encoded by nuclear genes. The mitochondrially-encoded subunits function in electron transfer, and the nuclear-encoded subunits may be involved in the regulation and assembly of the complex. This nuclear gene encodes isoform 2 of subunit IV. Isoform 1 of subunit IV is encoded by a different gene, however, the two genes show a similar structural organization. Subunit IV is the largest nuclear encoded subunit which plays a pivotal role in COX regulation. | 84701 | ENSG00000131055 | COX4I2 | cytochrome c oxidase subunit 4I2 | NA |
| NA | ENSG00000267128 | ENSG00000267128 | RP11-449J21.5 | NA | NA |
| NA | ENSG00000258999 | ENSG00000258999 | RP11-114N19.3 | NA | NA |
| This gene encodes a member of the hypoxia inducible gene 1 (HIG1) domain family. The encoded protein is localized to the cell membrane and has been linked to tumorigenesis and the progression of pituitary adenomas. Alternative splicing results in multiple transcript variants. | 51751 | ENSG00000131097 | HIGD1B | HIG1 hypoxia inducible domain family member 1B | NA |
| NA | 148808 | ENSG00000174514 | MFSD4A | major facilitator superfamily domain containing 4A | NA |
| This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | 63924 | ENSG00000187288 | CIDEC | cell death inducing DFFA like effector c | NA |
| NA | 100873993 | ENSG00000239799 | ITIH4-AS1 | ITIH4 antisense RNA 1 | NA |
| NA | 80150 | ENSG00000162174 | ASRGL1 | asparaginase like 1 | NA |
| Integrins are heterodimers composed of alpha and beta subunits that are noncovalently associated transmembrane glycoprotein receptors. Different combinations of alpha and beta polypeptides form complexes that vary in their ligand-binding specificities. Integrins mediate cell-matrix or cell-cell adhesion, and transduce signals that regulate gene expression and cell growth. This gene encodes the integrin beta 4 subunit, a receptor for the laminins. This subunit tends to associate with alpha 6 subunit and is likely to play a pivotal role in the biology of invasive carcinoma. Mutations in this gene are associated with epidermolysis bullosa with pyloric atresia. Multiple alternatively spliced transcript variants encoding distinct isoforms have been found for this gene. | 3691 | ENSG00000132470 | ITGB4 | integrin subunit beta 4 | NA |
| NA | NA | ENSG00000272016 | NA | NA | TRUE |
| This gene encodes the alpha chain of type XVIII collagen. This collagen is one of the multiplexins, extracellular matrix proteins that contain multiple triple-helix domains (collagenous domains) interrupted by non-collagenous domains. A long isoform of the protein has an N-terminal domain that is homologous to the extracellular part of frizzled receptors. Proteolytic processing at several endogenous cleavage sites in the C-terminal domain results in production of endostatin, a potent antiangiogenic protein that is able to inhibit angiogenesis and tumor growth. Mutations in this gene are associated with Knobloch syndrome. The main features of this syndrome involve retinal abnormalities, so type XVIII collagen may play an important role in retinal structure and in neural tube closure. Alternative splicing results in multiple transcript variants. | 80781 | ENSG00000182871 | COL18A1 | collagen type XVIII alpha 1 chain | NA |
| NA | ENSG00000255126 | ENSG00000255126 | CTD-2531D15.5 | NA | NA |
| NA | 221711 | ENSG00000153157 | SYCP2L | synaptonemal complex protein 2 like | NA |
| NA | ENSG00000259352 | ENSG00000259352 | RP11-109D20.2 | NA | NA |
| This gene encodes a member of the protein family comprised of both platelet-derived growth factors (PDGF) and vascular endothelial growth factors (VEGF). The encoded preproprotein is proteolytically processed to generate platelet-derived growth factor subunit B, which can homodimerize, or alternatively, heterodimerize with the related platelet-derived growth factor subunit A. These proteins bind and activate PDGF receptor tyrosine kinases, which play a role in a wide range of developmental processes. Mutations in this gene are associated with meningioma. Reciprocal translocations between chromosomes 22 and 17, at sites where this gene and that for collagen type 1, alpha 1 are located, are associated with dermatofibrosarcoma protuberans, a rare skin tumor. Alternative splicing results in multiple transcript variants. | 5155 | ENSG00000100311 | PDGFB | platelet derived growth factor subunit B | NA |
| NA | ENSG00000255118 | ENSG00000255118 | RP11-703H8.7 | NA | NA |
| The protein encoded by this gene stimulates the activity of several transcription factors and nuclear receptors, including estrogen receptor alpha, nuclear respiratory factor 1, and glucocorticoid receptor. The encoded protein may be involved in fat oxidation, non-oxidative glucose metabolism, and the regulation of energy expenditure. This protein is downregulated in prediabetic and type 2 diabetes mellitus patients. Certain allelic variations in this gene increase the risk of the development of obesity. Three transcript variants encoding different isoforms have been found for this gene. | 133522 | ENSG00000155846 | PPARGC1B | PPARG coactivator 1 beta | NA |
| NA | 57558 | ENSG00000118369 | USP35 | ubiquitin specific peptidase 35 | NA |
| NA | 150763 | ENSG00000186281 | GPAT2 | glycerol-3-phosphate acyltransferase 2, mitochondrial | NA |
| NA | ENSG00000259479 | ENSG00000259479 | SORD2P | sorbitol dehydrogenase 2, pseudogene | NA |
| Major alterations in the composition of the cartilage extracellular matrix occur in joint disease, such as osteoarthrosis. This gene encodes the cartilage intermediate layer protein (CILP), which increases in early osteoarthrosis cartilage. This gene was thought to encode a protein precursor for two different proteins, an N-terminal CILP and a C-terminal homolog of NTPPHase; however, later studies identified no nucleotide pyrophosphatase phosphodiesterase (NPP) activity. The full-length protein and its N-terminal domain were shown to function as an IGF-1 antagonist. An allelic variant of this gene has been associated with lumbar disc disease. | 8483 | ENSG00000138615 | CILP | cartilage intermediate layer protein | NA |
| This gene encodes a bifunctional signal transduction molecule. Dopaminergic and glutamatergic receptor stimulation regulates its phosphorylation and function as a kinase or phosphatase inhibitor. As a target for dopamine, this gene may serve as a therapeutic target for neurologic and psychiatric disorders. Multiple transcript variants encoding different isoforms have been found for this gene. | 84152 | ENSG00000131771 | PPP1R1B | protein phosphatase 1 regulatory inhibitor subunit 1B | NA |
| NA | ENSG00000237276 | ENSG00000237276 | ANO7P1 | anoctamin 7 pseudogene 1 | NA |
| Members of the CELF/BRUNOL protein family contain two N-terminal RNA recognition motif (RRM) domains, one C-terminal RRM domain, and a divergent segment of 160-230 aa between the second and third RRM domains. Members of this protein family regulate pre-mRNA alternative splicing and may also be involved in mRNA editing, and translation. Alternative splicing results in multiple transcript variants encoding different isoforms. | 10659 | ENSG00000048740 | CELF2 | CUGBP, Elav-like family member 2 | NA |
| NA | ENSG00000259772 | ENSG00000259772 | RP11-16E12.2 | NA | NA |
| This gene encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the mitochondrial inner membrane and is involved in the conversion of progesterone to cortisol in the adrenal cortex. Mutations in this gene cause congenital adrenal hyperplasia due to 11-beta-hydroxylase deficiency. Transcript variants encoding different isoforms have been noted for this gene. | 1584 | ENSG00000160882 | CYP11B1 | cytochrome P450 family 11 subfamily B member 1 | NA |
| NA | ENSG00000267396 | ENSG00000267396 | RP11-845C23.3 | NA | NA |
| The protein encoded by this gene is a member of the keratin gene family. The keratins are intermediate filament proteins responsible for the structural integrity of epithelial cells and are subdivided into cytokeratins and hair keratins. Most of the type I cytokeratins consist of acidic proteins which are arranged in pairs of heterotypic keratin chains and are clustered in a region of chromosome 17q12-q21. This keratin has been coexpressed with keratin 14 in a number of epithelial tissues, including esophagus, tongue, and hair follicles. Mutations in this gene are associated with type 1 pachyonychia congenita, non-epidermolytic palmoplantar keratoderma and unilateral palmoplantar verrucous nevus. | 3868 | ENSG00000186832 | KRT16 | keratin 16 | NA |
| NA | 359845 | ENSG00000183688 | FAM101B | family with sequence similarity 101 member B | NA |
| NA | 5140 | ENSG00000152270 | PDE3B | phosphodiesterase 3B | NA |
| The protein encoded by this gene belongs to the innexin family. Innexin family members are the structural components of gap junctions. This protein and pannexin 1 are abundantly expressed in central nervous system (CNS) and are coexpressed in various neuronal populations. Studies in Xenopus oocytes suggest that this protein alone and in combination with pannexin 1 may form cell type-specific gap junctions with distinct properties. Multiple transcript variants encoding different isoforms have been found for this gene. | 56666 | ENSG00000073150 | PANX2 | pannexin 2 | NA |
| This gene encodes a member of a family of proteins that function as negative regulators of Wnt receptor signaling through interaction with Dishevelled family members. The encoded protein participates in the delivery of transforming growth factor alpha-containing vesicles to the cell membrane. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 85409 | ENSG00000145506 | NKD2 | naked cuticle homolog 2 | NA |
| The protein encoded by this gene is a cytokine receptor that belongs to the interleukin 1 receptor family. This receptor specifically binds interleukin 18 (IL18), and is essential for IL18 mediated signal transduction. IFN-alpha and IL12 are reported to induce the expression of this receptor in NK and T cells. This gene, along with four other members of the interleukin 1 receptor family, including IL1R2, IL1R1, IL1RL2 (IL-1Rrp2), and IL1RL1 (T1/ST2), forms a gene cluster on chromosome 2q. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | 8809 | ENSG00000115604 | IL18R1 | interleukin 18 receptor 1 | NA |
| NA | 441869 | ENSG00000235098 | ANKRD65 | ankyrin repeat domain 65 | NA |
| NA | ENSG00000225972 | ENSG00000225972 | MTND1P23 | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 1 pseudogene 23 | NA |
| Steroid 5-alpha-reductase (EC 1.3.99.5) catalyzes the conversion of testosterone into the more potent androgen, dihydrotestosterone (DHT). Also see SRD5A2 (MIM 607306). | 6715 | ENSG00000145545 | SRD5A1 | steroid 5 alpha-reductase 1 | NA |
| The protein encoded by this gene is a secretory protein that contains a hyaluronan-binding domain, and thus is a member of the hyaluronan-binding protein family. The hyaluronan-binding domain is known to be involved in extracellular matrix stability and cell migration. This protein has been shown to form a stable complex with inter-alpha-inhibitor (I alpha I), and thus enhance the serine protease inhibitory activity of I alpha I, which is important in the protease network associated with inflammation. This gene can be induced by proinflammatory cytokines such as tumor necrosis factor alpha and interleukin-1. Enhanced levels of this protein are found in the synovial fluid of patients with osteoarthritis and rheumatoid arthritis. | 7130 | ENSG00000123610 | TNFAIP6 | TNF alpha induced protein 6 | NA |
| NA | ENSG00000272077 | ENSG00000272077 | RP11-348P10.2 | NA | NA |
| This gene encodes a glycoprotein involved in hemostasis. The encoded preproprotein is proteolytically processed following assembly into large multimeric complexes. These complexes function in the adhesion of platelets to sites of vascular injury and the transport of various proteins in the blood. Mutations in this gene result in von Willebrand disease, an inherited bleeding disorder. An unprocessed pseudogene has been found on chromosome 22. | 7450 | ENSG00000110799 | VWF | von Willebrand factor | NA |
| This gene is one of several genes encoding pulmonary-surfactant associated proteins (SFTPA) located on chromosome 10. Mutations in this gene and a highly similar gene located nearby, which affect the highly conserved carbohydrate recognition domain, are associated with idiopathic pulmonary fibrosis. The current version of the assembly displays only a single centromeric SFTPA gene pair rather than the two gene pairs shown in the previous assembly which were thought to have resulted from a duplication. | 729238 | ENSG00000185303 | SFTPA2 | surfactant protein A2 | NA |
| The protein encoded by this gene is an adenosine receptor that belongs to the G-protein coupled receptor 1 family. There are 3 types of adenosine receptors, each with a specific pattern of ligand binding and tissue distribution, and together they regulate a diverse set of physiologic functions. The type A1 receptors inhibit adenylyl cyclase, and play a role in the fertilization process. Animal studies also suggest a role for A1 receptors in kidney function and ethanol intoxication. Transcript variants with alternative splicing in the 5’ UTR have been found for this gene. | 134 | ENSG00000163485 | ADORA1 | adenosine A1 receptor | NA |
| This gene encodes a nuclear protein belonging to the hairy and enhancer of split-related (HESR) family of basic helix-loop-helix (bHLH)-type transcriptional repressors. Expression of this gene is induced by the Notch and c-Jun signal transduction pathways. Two similar and redundant genes in mouse are required for embryonic cardiovascular development, and are also implicated in neurogenesis and somitogenesis. Alternative splicing results in multiple transcript variants. | 23462 | ENSG00000164683 | HEY1 | hes related family bHLH transcription factor with YRPW motif 1 | NA |
| NA | 84541 | ENSG00000163376 | KBTBD8 | kelch repeat and BTB domain containing 8 | NA |
| NA | 8503 | ENSG00000117461 | PIK3R3 | phosphoinositide-3-kinase regulatory subunit 3 | NA |
| NA | 220963 | ENSG00000165449 | SLC16A9 | solute carrier family 16 member 9 | NA |
| NA | NA | ENSG00000257499 | NA | NA | TRUE |
| Sorbitol dehydrogenase (SORD; EC 1.1.1.14) catalyzes the interconversion of polyols and their corresponding ketoses, and together with aldose reductase (ALDR1; MIM 103880), makes up the sorbitol pathway that is believed to play an important role in the development of diabetic complications (summarized by Carr and Markham, 1995 [PubMed 8535074]). The first reaction of the pathway (also called the polyol pathway) is the reduction of glucose to sorbitol by ALDR1 with NADPH as the cofactor. SORD then oxidizes the sorbitol to fructose using NAD(+) cofactor. | 6652 | ENSG00000140263 | SORD | sorbitol dehydrogenase | NA |
| NA | 400684 | ENSG00000267213 | LOC400684 | uncharacterized LOC400684 | NA |
| The protein encoded by this gene has a long and a short form, generated by use of alternative translational start codons. The long form is expressed in steroidogenic tissues such as testis, where it converts cholesteryl esters to free cholesterol for steroid hormone production. The short form is expressed in adipose tissue, among others, where it hydrolyzes stored triglycerides to free fatty acids. | 3991 | ENSG00000079435 | LIPE | lipase E, hormone sensitive type | NA |
| This gene encodes a protein that contains several helicase family domains. Mutations in this gene have been found in some patients with the CHARGE syndrome. Two transcript variants encoding different isoforms have been found for this gene. | 55636 | ENSG00000171316 | CHD7 | chromodomain helicase DNA binding protein 7 | NA |
| The protein encoded by this gene is a cell membrane protein that may be involved in iron export from duodenal epithelial cells. Defects in this gene are a cause of hemochromatosis type 4 (HFE4). | 30061 | ENSG00000138449 | SLC40A1 | solute carrier family 40 member 1 | NA |
| NA | ENSG00000217648 | ENSG00000217648 | RP1-95L4.4 | NA | NA |
| This gene encodes a member of a family of proteins that contain coiled-coil domains and may form hetero- or homomers. The encoded protein is involved in cell proliferation and calcium signaling. It also interacts with the mitogen-activated protein kinase kinase kinase 5 (MAP3K5/ASK1) and positively regulates MAP3K5-induced apoptosis. Multiple alternatively spliced transcript variants have been observed. | 7164 | ENSG00000111907 | TPD52L1 | tumor protein D52-like 1 | NA |
| NA | ENSG00000262663 | ENSG00000262663 | RP11-497H17.1 | NA | NA |
| This gene belongs to the short chain dehydrogenase/reductase superfamily. It encodes a reductase enzyme involved in the first step of wax biosynthesis wherein fatty acids are converted to fatty alcohols. The encoded peroxisomal protein utilizes saturated fatty acids of 16 or 18 carbons as preferred substrates. Alternatively spliced transcript variants have been observed for this gene. Related pseudogenes have been identified on chromosomes 2, 14 and 22. | 55711 | ENSG00000064763 | FAR2 | fatty acyl-CoA reductase 2 | NA |
| This gene encodes the enzyme responsible for hydrolysis of both HIBYL-CoA and beta-hydroxypropionyl-CoA. Mutations in this gene have been associated with 3-hydroxyisobutyryl-CoA hydrolase deficiency. Alternative splicing results in multiple transcript variants. | 26275 | ENSG00000198130 | HIBCH | 3-hydroxyisobutyryl-CoA hydrolase | NA |
| NA | 113115 | ENSG00000146410 | MTFR2 | mitochondrial fission regulator 2 | NA |
| NA | 55228 | ENSG00000182013 | PNMAL1 | paraneoplastic Ma antigen family-like 1 | NA |
| NA | NA | ENSG00000175898 | NA | NA | TRUE |
| The protein encoded by this gene mediates sodium and chloride transport and reabsorption. The encoded protein is a membrane protein and is important in maintaining proper ionic balance and cell volume. This protein is phosphorylated in response to DNA damage. Three transcript variants encoding two different isoforms have been found for this gene. | 6558 | ENSG00000064651 | SLC12A2 | solute carrier family 12 member 2 | NA |
| This gene encodes a member of the NipSnap family of proteins that may be involved in vesicular transport. A similar protein in mice inhibits the calcium channel TRPV6, and is also localized to the inner mitochondrial membrane where it may play a role in mitochondrial DNA maintenance. A pseudogene of this gene is located on the short arm of chromosome 17. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 8508 | ENSG00000184117 | NIPSNAP1 | nipsnap homolog 1 (C. elegans) | NA |
| This gene encodes a member of the tumor necrosis factor receptor superfamily. The encoded protein activates nuclear factor kappa-B and mitogen-activated protein kinase 8 (also called c-Jun N-terminal kinase 1), and induces cell apoptosis. Through its death domain, the encoded receptor interacts with tumor necrosis factor receptor type 1-associated death domain (TRADD) protein, which is known to mediate signal transduction of tumor necrosis factor receptors. Knockout studies in mice suggest that this gene plays a role in T-helper cell activation, and may be involved in inflammation and immune regulation. | 27242 | ENSG00000146072 | TNFRSF21 | tumor necrosis factor receptor superfamily member 21 | NA |
| NA | ENSG00000272668 | ENSG00000272668 | RP11-190A12.8 | NA | NA |
| NA | ENSG00000250899 | ENSG00000250899 | RP11-253E3.3 | NA | NA |
| NA | 28978 | ENSG00000096092 | TMEM14A | transmembrane protein 14A | NA |
| NA | 84866 | ENSG00000149582 | TMEM25 | transmembrane protein 25 | NA |
| This gene encodes a member of the Nedd4 family of HECT domain E3 ubiquitin ligases. HECT domain E3 ubiquitin ligases transfer ubiquitin from E2 ubiquitin-conjugating enzymes to protein substrates, thus targeting specific proteins for lysosomal degradation. The encoded protein mediates the ubiquitination of multiple target substrates and plays a critical role in epithelial sodium transport by regulating the cell surface expression of the epithelial sodium channel, ENaC. Single nucleotide polymorphisms in this gene may be associated with essential hypertension. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 23327 | ENSG00000049759 | NEDD4L | neural precursor cell expressed, developmentally down-regulated 4-like, E3 ubiquitin protein ligase | NA |
| Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. Nucleosomes consist of approximately 146 bp of DNA wrapped around a histone octamer composed of pairs of each of the four core histones (H2A, H2B, H3, and H4). The chromatin fiber is further compacted through the interaction of a linker histone, H1, with the DNA between the nucleosomes to form higher order chromatin structures. This gene is intronless and encodes a replication-dependent histone that is a member of the histone H2B family. Two transcripts that encode the same protein have been identified for this gene, which is found in the large histone gene cluster on chromosome 6p22-p21.3. | 3017 | ENSG00000158373 | HIST1H2BD | histone cluster 1, H2bd | NA |
| NA | ENSG00000256072 | ENSG00000256072 | RP11-335I12.2 | NA | NA |
| IGSF4B is a brain-specific protein related to the calcium-independent cell-cell adhesion molecules known as nectins (see PVRL3; MIM 607147) (Kakunaga et al., 2005 [PubMed 15741237]). | 57863 | ENSG00000162706 | CADM3 | cell adhesion molecule 3 | NA |
| NA | ENSG00000251196 | ENSG00000251196 | RP11-54F2.1 | NA | NA |
| The protein encoded by this gene belongs to the thrombospondin protein family. Thrombospondin family members are adhesive glycoproteins that mediate cell-to-cell and cell-to-matrix interactions. This protein forms a pentamer and can bind to heparin and calcium. It is involved in local signaling in the developing and adult nervous system, and it contributes to spinal sensitization and neuropathic pain states. This gene is activated during the stromal response to invasive breast cancer. It may also play a role in inflammatory responses in Alzheimer’s disease. Alternative splicing results in multiple transcript variants. | 7060 | ENSG00000113296 | THBS4 | thrombospondin 4 | NA |
| RGMB is a glycosylphosphatidylinositol (GPI)-anchored member of the repulsive guidance molecule family (see RGMA, MIM 607362) and contributes to the patterning of the developing nervous system (Samad et al., 2005 [PubMed 15671031]). | 285704 | ENSG00000174136 | RGMB | repulsive guidance molecule family member b | NA |
| This gene encodes a member of the KH-domain protein subfamily. Proteins of this subfamily, also referred to as alpha-CPs, bind to RNA with a specificity for C-rich pyrimidine regions. Alpha-CPs play important roles in post-transcriptional activities and have different cellular distributions. This gene’s protein is found in the cytoplasm, yet it lacks the nuclear localization signals found in other subfamily members. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | 54039 | ENSG00000183570 | PCBP3 | poly(rC) binding protein 3 | NA |
| NA | ENSG00000261428 | ENSG00000261428 | RP11-16P6.1 | NA | NA |
| Tight junctions represent one mode of cell-to-cell adhesion in epithelial or endothelial cell sheets, forming continuous seals around cells and serving as a physical barrier to prevent solutes and water from passing freely through the paracellular space. These junctions are comprised of sets of continuous networking strands in the outwardly facing cytoplasmic leaflet, with complementary grooves in the inwardly facing extracytoplasmic leaflet. The protein encoded by this gene, a member of the claudin family, is an integral membrane protein and a component of tight junction strands. Loss of function mutations result in neonatal ichthyosis-sclerosing cholangitis syndrome. | 9076 | ENSG00000163347 | CLDN1 | claudin 1 | NA |
| NA | 100289388 | ENSG00000246174 | KCTD21-AS1 | KCTD21 antisense RNA 1 | NA |
| NA | ENSG00000271833 | ENSG00000271833 | RP11-356B19.11 | NA | NA |
| The protein encoded by this gene is a member of the ros/insulin receptor family of tyrosine kinases. Tyrosine-specific phosphorylation of proteins is a key to the control of diverse pathways leading to cell growth and differentiation. Multiple transcript variants encoding different isoforms have been found for this gene. | 4058 | ENSG00000062524 | LTK | leukocyte receptor tyrosine kinase | NA |
| The protein encoded by this gene is a cis-Golgi transmembrane protein that may be necessary for the long-term survival of nociceptive and autonomic ganglion neurons. Mutations in this gene are a cause of hereditary sensory and autonomic neuropathy type IIB (HSAN IIB), and this gene may also play a role in susceptibility to vascular dementia. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | 54463 | ENSG00000154153 | FAM134B | family with sequence similarity 134 member B | NA |
| This gene encodes a subunit of a cytokine that acts on T and natural killer cells, and has a broad array of biological activities. The cytokine is a disulfide-linked heterodimer composed of the 35-kD subunit encoded by this gene, and a 40-kD subunit that is a member of the cytokine receptor family. This cytokine is required for the T-cell-independent induction of interferon (IFN)-gamma, and is important for the differentiation of both Th1 and Th2 cells. The responses of lymphocytes to this cytokine are mediated by the activator of transcription protein STAT4. Nitric oxide synthase 2A (NOS2A/NOS2) is found to be required for the signaling process of this cytokine in innate immunity. | 3592 | ENSG00000168811 | IL12A | interleukin 12A | NA |
| This gene encodes a member of the F-box protein family, members of which are characterized by an approximately 40 amino acid motif, the F-box. The F-box proteins constitute one of the four subunits of ubiquitin protein ligase complex called SCFs (SKP1-cullin-F-box), which function in phosphorylation-dependent ubiquitination. The F-box proteins are divided into three classes: Fbws containing WD-40 domains, Fbls containing leucine-rich repeats, and Fbxs containing either different protein-protein interaction modules or no recognizable motifs. The protein encoded by this gene belongs to the Fbx class. Multiple transcript variants encoding different isoforms have been found for this gene. | 157574 | ENSG00000214050 | FBXO16 | F-box protein 16 | NA |
# Save the queried Ensembl gene IDs for cluster 18 (row 18 of gene_list), one ID per line.
write.table(as.factor(out$query), paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 18, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
# Annotate the top genes of cluster 19 (row 19 of gene_list) via MyGene.info.
out <- mygene::queryMany(gene_list[19, ], scopes = "ensembl.gene", fields = c("name", "summary", "symbol"), species = "human")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| name | X_id | symbol | query | summary | notfound |
|---|---|---|---|---|---|
| myosin light chain, phosphorylatable, fast skeletal muscle | 29895 | MYLPF | ENSG00000180209 | NA | NA |
| troponin I2, fast skeletal type | 7136 | TNNI2 | ENSG00000130598 | This gene encodes a fast-twitch skeletal muscle protein, a member of the troponin I gene family, and a component of the troponin complex including troponin T, troponin C and troponin I subunits. The troponin complex, along with tropomyosin, is responsible for the calcium-dependent regulation of striated muscle contraction. Mouse studies show that this component is also present in vascular smooth muscle and may play a role in regulation of smooth muscle function. In addition to muscle tissues, this protein is found in corneal epithelium, cartilage where it is an inhibitor of angiogenesis to inhibit tumor growth and metastasis, and mammary gland where it functions as a co-activator of estrogen receptor-related receptor alpha. This protein also suppresses tumor growth in human ovarian carcinoma. Mutations in this gene cause myopathy and distal arthrogryposis type 2B. Alternatively spliced transcript variants have been found for this gene. | NA |
| myosin, heavy chain 1, skeletal muscle, adult | 4619 | MYH1 | ENSG00000109061 | Myosin is a major contractile protein which converts chemical energy into mechanical energy through the hydrolysis of ATP. Myosin is a hexameric protein composed of a pair of myosin heavy chains (MYH) and two pairs of nonidentical light chains. Myosin heavy chains are encoded by a multigene family. In mammals at least 10 different myosin heavy chain (MYH) isoforms have been described from striated, smooth, and nonmuscle cells. These isoforms show expression that is spatially and temporally regulated during development. | NA |
| protein phosphatase 1 regulatory subunit 27 | 116729 | PPP1R27 | ENSG00000182676 | NA | NA |
| cerebral dopamine neurotrophic factor | 441549 | CDNF | ENSG00000185267 | NA | NA |
| myosin light chain 1 | 4632 | MYL1 | ENSG00000168530 | Myosin is a hexameric ATPase cellular motor protein. It is composed of two heavy chains, two nonphosphorylatable alkali light chains, and two phosphorylatable regulatory light chains. This gene encodes a myosin alkali light chain expressed in fast skeletal muscle. Two transcript variants have been identified for this gene. | NA |
| NA | ENSG00000260500 | CTD-3193O13.1 | ENSG00000260500 | NA | NA |
| transforming growth factor beta receptor 3 like | 100507588 | TGFBR3L | ENSG00000260001 | NA | NA |
| ADAM metallopeptidase domain 19 | 8728 | ADAM19 | ENSG00000135074 | This gene encodes a member of the ADAM (a disintegrin and metalloprotease domain) family. Members of this family are membrane-anchored proteins structurally related to snake venom disintegrins and have been implicated in a variety of biological processes involving cell-cell and cell-matrix interactions, including fertilization, muscle development, and neurogenesis. This member is a type I transmembrane protein and serves as a marker for dendritic cell differentiation. It has been demonstrated to be an active metalloproteinase, which may be involved in normal physiological processes such as cell migration, cell adhesion, cell-cell and cell-matrix interactions, and signal transduction. It is proposed to play a role in pathological processes, such as cancer, inflammatory diseases, renal diseases, and Alzheimer’s disease. | NA |
| myosin binding protein C, fast type | 4606 | MYBPC2 | ENSG00000086967 | This gene encodes a member of the myosin-binding protein C family. This family includes the fast-, slow- and cardiac-type isoforms, each of which is a myosin-associated protein found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The protein encoded by this locus is referred to as the fast-type isoform. Mutations in the related but distinct genes encoding the slow-type and cardiac-type isoforms have been associated with distal arthrogryposis, type 1 and hypertrophic cardiomyopathy, respectively. | NA |
| ATPase sarcoplasmic/endoplasmic reticulum Ca2+ transporting 1 | 487 | ATP2A1 | ENSG00000196296 | This gene encodes one of the SERCA Ca(2+)-ATPases, which are intracellular pumps located in the sarcoplasmic or endoplasmic reticula of muscle cells. This enzyme catalyzes the hydrolysis of ATP coupled with the translocation of calcium from the cytosol to the sarcoplasmic reticulum lumen, and is involved in muscular excitation and contraction. Mutations in this gene cause some autosomal recessive forms of Brody disease, characterized by increasing impairment of muscular relaxation during exercise. Alternative splicing results in three transcript variants encoding different isoforms. | NA |
| family with sequence similarity 83 member D | 81610 | FAM83D | ENSG00000101447 | NA | NA |
| CA3 antisense RNA 1 | 100996348 | CA3-AS1 | ENSG00000253549 | NA | NA |
| uncharacterized LOC100507537 | 100507537 | LOC100507537 | ENSG00000240045 | NA | NA |
| carbonic anhydrase 3 | 761 | CA3 | ENSG00000164879 | Carbonic anhydrase III (CAIII) is a member of a multigene family (at least six separate genes are known) that encodes carbonic anhydrase isozymes. These carbonic anhydrases are a class of metalloenzymes that catalyze the reversible hydration of carbon dioxide and are differentially expressed in a number of cell types. The expression of the CA3 gene is strictly tissue specific and present at high levels in skeletal muscle and much lower levels in cardiac and smooth muscle. A proportion of carriers of Duchenne muscle dystrophy have a higher CA3 level than normal. The gene spans 10.3 kb and contains seven exons and six introns. | NA |
| nebulin | 4703 | NEB | ENSG00000183091 | This gene encodes nebulin, a giant protein component of the cytoskeletal matrix that coexists with the thick and thin filaments within the sarcomeres of skeletal muscle. In most vertebrates, nebulin accounts for 3 to 4% of the total myofibrillar protein. The encoded protein contains approximately 30-amino acid long modules that can be classified into 7 types and other repeated modules. Protein isoform sizes vary from 600 to 800 kD due to alternative splicing that is tissue-, species-, and developmental stage-specific. Of the 183 exons in the nebulin gene, at least 43 are alternatively spliced, although exons 143 and 144 are not found in the same transcript. Of the several thousand transcript variants predicted for nebulin, the RefSeq Project has decided to create three representative RefSeq records. Mutations in this gene are associated with recessive nemaline myopathy. | NA |
| calcium voltage-gated channel auxiliary subunit beta 2 | 783 | CACNB2 | ENSG00000165995 | This gene encodes a subunit of a voltage-dependent calcium channel protein that is a member of the voltage-gated calcium channel superfamily. The gene product was originally identified as an antigen target in Lambert-Eaton myasthenic syndrome, an autoimmune disorder. Mutations in this gene are associated with Brugada syndrome. Alternatively spliced variants encoding different isoforms have been described. | NA |
| troponin C2, fast skeletal type | 7125 | TNNC2 | ENSG00000101470 | Troponin (Tn), a key protein complex in the regulation of striated muscle contraction, is composed of 3 subunits. The Tn-I subunit inhibits actomyosin ATPase, the Tn-T subunit binds tropomyosin and Tn-C, while the Tn-C subunit binds calcium and overcomes the inhibitory action of the troponin complex on actin filaments. The protein encoded by this gene is the Tn-C subunit. | NA |
| myosin binding protein C, slow type | 4604 | MYBPC1 | ENSG00000196091 | This gene encodes a member of the myosin-binding protein C family. Myosin-binding protein C family members are myosin-associated proteins found in the cross-bridge-bearing zone (C region) of A bands in striated muscle. The encoded protein is the slow skeletal muscle isoform of myosin-binding protein C and plays an important role in muscle contraction by recruiting muscle-type creatine kinase to myosin filaments. Mutations in this gene are associated with distal arthrogryposis type I. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| actinin alpha 1 | 87 | ACTN1 | ENSG00000072110 | Alpha actinins belong to the spectrin gene superfamily which represents a diverse group of cytoskeletal proteins, including the alpha and beta spectrins and dystrophins. Alpha actinin is an actin-binding protein with multiple roles in different cell types. In nonmuscle cells, the cytoskeletal isoform is found along microfilament bundles and adherens-type junctions, where it is involved in binding actin to the membrane. In contrast, skeletal, cardiac, and smooth muscle isoforms are localized to the Z-disc and analogous dense bodies, where they help anchor the myofibrillar actin filaments. This gene encodes a nonmuscle, cytoskeletal, alpha actinin isoform and maps to the same site as the structurally similar erythroid beta spectrin gene. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| ATPase plasma membrane Ca2+ transporting 4 | 493 | ATP2B4 | ENSG00000058668 | The protein encoded by this gene belongs to the family of P-type primary ion transport ATPases characterized by the formation of an aspartyl phosphate intermediate during the reaction cycle. These enzymes remove bivalent calcium ions from eukaryotic cells against very large concentration gradients and play a critical role in intracellular calcium homeostasis. The mammalian plasma membrane calcium ATPase isoforms are encoded by at least four separate genes and the diversity of these enzymes is further increased by alternative splicing of transcripts. The expression of different isoforms and splice variants is regulated in a developmental, tissue- and cell type-specific manner, suggesting that these pumps are functionally adapted to the physiological needs of particular cells and tissues. This gene encodes the plasma membrane calcium ATPase isoform 4. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| troponin I1, slow skeletal type | 7135 | TNNI1 | ENSG00000159173 | Troponin proteins associate with tropomyosin and regulate the calcium sensitivity of the myofibril contractile apparatus of striated muscles. Troponin I (TnI), along with troponin T (TnT) and troponin C (TnC), is one of 3 subunits that form the troponin complex of the thin filaments of striated muscle. TnI is the inhibitory subunit; blocking actin-myosin interactions and thereby mediating striated muscle relaxation. The TnI subfamily contains three genes: TnI-skeletal-fast-twitch, TnI-skeletal-slow-twitch, and TnI-cardiac. The TnI-fast and TnI-slow genes are expressed in fast-twitch and slow-twitch skeletal muscle fibers, respectively, while the TnI-cardiac gene is expressed exclusively in cardiac muscle tissue. This gene encodes the Troponin-I-skeletal-slow-twitch protein. This gene is expressed in cardiac and skeletal muscle during early development but is restricted to slow-twitch skeletal muscle fibers in adults. The encoded protein prevents muscle contraction by inhibiting calcium-mediated conformational changes in actin-myosin complexes. | NA |
| SH3 and cysteine rich domain 3 | 246329 | STAC3 | ENSG00000185482 | The protein encoded by this gene is a component of the excitation-contraction coupling machinery of muscles. This protein is a member of the Stac gene family and contains an N-terminal cysteine-rich domain and two SH3 domains. Mutations in this gene are a cause of Native American myopathy. | NA |
| myosin, heavy chain 2, skeletal muscle, adult | 4620 | MYH2 | ENSG00000125414 | Myosins are actin-based motor proteins that function in the generation of mechanical force in eukaryotic cells. Muscle myosins are heterohexamers composed of 2 myosin heavy chains and 2 pairs of nonidentical myosin light chains. This gene encodes a member of the class II or conventional myosin heavy chains, and functions in skeletal muscle contraction. This gene is found in a cluster of myosin heavy chain genes on chromosome 17. A mutation in this gene results in inclusion body myopathy-3. Multiple alternatively spliced variants, encoding the same protein, have been identified. | NA |
| G protein-coupled receptor 162 | 27239 | GPR162 | ENSG00000250510 | This gene was identified upon genomic analysis of a gene-dense region at human chromosome 12p13. It appears to be mainly expressed in the brain; however, its function is not known. Alternatively spliced transcript variants encoding different isoforms have been identified. | NA |
| NA | ENSG00000257261 | RP11-96H19.1 | ENSG00000257261 | NA | NA |
| long intergenic non-protein coding RNA 1372 | 101929736 | LINC01372 | ENSG00000235475 | NA | NA |
| SPARC related modular calcium binding 1 | 64093 | SMOC1 | ENSG00000198732 | This gene encodes a multi-domain secreted protein that may have a critical role in ocular and limb development. Mutations in this gene are associated with microphthalmia and limb anomalies. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| keratin 8 | 3856 | KRT8 | ENSG00000170421 | This gene is a member of the type II keratin family clustered on the long arm of chromosome 12. Type I and type II keratins heteropolymerize to form intermediate-sized filaments in the cytoplasm of epithelial cells. The product of this gene typically dimerizes with keratin 18 to form an intermediate filament in simple single-layered epithelial cells. This protein plays a role in maintaining cellular structural integrity and also functions in signal transduction and cellular differentiation. Mutations in this gene cause cryptogenic cirrhosis. Alternatively spliced transcript variants have been found for this gene. | NA |
| family with sequence similarity 46 member B | 115572 | FAM46B | ENSG00000158246 | NA | NA |
| NA | ENSG00000232220 | AC008440.5 | ENSG00000232220 | NA | NA |
| myeloid-associated differentiation marker | 91663 | MYADM | ENSG00000179820 | NA | NA |
| cold shock domain containing C2 | 27254 | CSDC2 | ENSG00000172346 | NA | NA |
| aldolase, fructose-bisphosphate C | 230 | ALDOC | ENSG00000109107 | This gene encodes a member of the class I fructose-biphosphate aldolase gene family. Expressed specifically in the hippocampus and Purkinje cells of the brain, the encoded protein is a glycolytic enzyme that catalyzes the reversible aldol cleavage of fructose-1,6-biphosphate and fructose 1-phosphate to dihydroxyacetone phosphate and either glyceraldehyde-3-phosphate or glyceraldehyde, respectively. | NA |
| smoothelin | 6525 | SMTN | ENSG00000183963 | This gene encodes a structural protein that is found exclusively in contractile smooth muscle cells. It associates with stress fibers and constitutes part of the cytoskeleton. This gene is localized to chromosome 22q12.3, distal to the TUPLE1 locus and outside the DiGeorge syndrome deletion. Alternative splicing of this gene results in multiple transcript variants encoding distinct isoforms. | NA |
| ryanodine receptor 1 | 6261 | RYR1 | ENSG00000196218 | This gene encodes a ryanodine receptor found in skeletal muscle. The encoded protein functions as a calcium release channel in the sarcoplasmic reticulum but also serves to connect the sarcoplasmic reticulum and transverse tubule. Mutations in this gene are associated with malignant hyperthermia susceptibility, central core disease, and minicore myopathy with external ophthalmoplegia. Alternatively spliced transcripts encoding different isoforms have been described. | NA |
| calmodulin like 6 | 163688 | CALML6 | ENSG00000169885 | NA | NA |
| zyxin | 7791 | ZYX | ENSG00000159840 | Focal adhesions are actin-rich structures that enable cells to adhere to the extracellular matrix and at which protein complexes involved in signal transduction assemble. Zyxin is a zinc-binding phosphoprotein that concentrates at focal adhesions and along the actin cytoskeleton. Zyxin has an N-terminal proline-rich domain and three LIM domains in its C-terminal half. The proline-rich domain may interact with SH3 domains of proteins involved in signal transduction pathways while the LIM domains are likely involved in protein-protein binding. Zyxin may function as a messenger in the signal transduction pathway that mediates adhesion-stimulated changes in gene expression and may modulate the cytoskeletal organization of actin bundles. Alternative splicing results in multiple transcript variants that encode the same isoform. | NA |
| CACNA1C antisense RNA 2 | 100874235 | CACNA1C-AS2 | ENSG00000256271 | NA | NA |
| prostaglandin F2 receptor inhibitor | 5738 | PTGFRN | ENSG00000134247 | NA | NA |
| KIAA1217 | 56243 | KIAA1217 | ENSG00000120549 | NA | NA |
| phospholipase A2 group V | 5322 | PLA2G5 | ENSG00000127472 | This gene is a member of the secretory phospholipase A2 family. It is located in a tightly-linked cluster of secretory phospholipase A2 genes on chromosome 1. The encoded enzyme catalyzes the hydrolysis of membrane phospholipids to generate lysophospholipids and free fatty acids including arachidonic acid. It preferentially hydrolyzes linoleoyl-containing phosphatidylcholine substrates. Secretion of this enzyme is thought to induce inflammatory responses in neighboring cells. Alternatively spliced transcript variants have been found, but their full-length nature has not been determined. | NA |
| apolipoprotein D | 347 | APOD | ENSG00000189058 | This gene encodes a component of high density lipoprotein that has no marked similarity to other apolipoprotein sequences. It has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2 microglobulin protein superfamily of carrier proteins, also known as lipocalins. This glycoprotein is closely associated with the enzyme lecithin:cholesterol acyltransferase - an enzyme involved in lipoprotein metabolism. | NA |
| ST3GAL5 antisense RNA 1 (head to head) | ENSG00000232504 | ST3GAL5-AS1 | ENSG00000232504 | NA | NA |
| dishevelled binding antagonist of beta catenin 3 | 147906 | DACT3 | ENSG00000197380 | NA | NA |
| peptidyl arginine deiminase 2 | 11240 | PADI2 | ENSG00000117115 | This gene encodes a member of the peptidyl arginine deiminase family of enzymes, which catalyze the post-translational deimination of proteins by converting arginine residues into citrullines in the presence of calcium ions. The family members have distinct substrate specificities and tissue-specific expression patterns. The type II enzyme is the most widely expressed family member. Known substrates for this enzyme include myelin basic protein in the central nervous system and vimentin in skeletal muscle and macrophages. This enzyme is thought to play a role in the onset and progression of neurodegenerative human disorders, including Alzheimer disease and multiple sclerosis, and it has also been implicated in glaucoma pathogenesis. This gene exists in a cluster with four other paralogous genes. | NA |
| phosphorylase, glycogen; brain | 5834 | PYGB | ENSG00000100994 | The protein encoded by this gene is a glycogen phosphorylase found predominantly in the brain. The encoded protein forms homodimers which can associate into homotetramers, the enzymatically active form of glycogen phosphorylase. The activity of this enzyme is positively regulated by AMP and negatively regulated by ATP, ADP, and glucose-6-phosphate. This enzyme catalyzes the rate-determining step in glycogen degradation. | NA |
| protein phosphatase 1 regulatory subunit 12C | 54776 | PPP1R12C | ENSG00000125503 | The gene encodes a subunit of myosin phosphatase. The encoded protein regulates the catalytic activity of protein phosphatase 1 delta and assembly of the actin cytoskeleton. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. | NA |
| myozenin 1 | 58529 | MYOZ1 | ENSG00000177791 | The protein encoded by this gene is primarily expressed in the skeletal muscle, and belongs to the myozenin family. Members of this family function as calcineurin-interacting proteins that help tether calcineurin to the sarcomere of cardiac and skeletal muscle. They play an important role in modulation of calcineurin signaling. | NA |
| synaptogyrin 3 | 9143 | SYNGR3 | ENSG00000127561 | This gene encodes an integral membrane protein. The exact function of this protein is unclear, but studies of a similar murine protein suggest that it is a synaptic vesicle protein that also interacts with the dopamine transporter. The gene product belongs to the synaptogyrin gene family. | NA |
| mitogen-activated protein kinase kinase 6 | 5608 | MAP2K6 | ENSG00000108984 | This gene encodes a member of the dual specificity protein kinase family, which functions as a mitogen-activated protein (MAP) kinase kinase. MAP kinases, also known as extracellular signal-regulated kinases (ERKs), act as an integration point for multiple biochemical signals. This protein phosphorylates and activates p38 MAP kinase in response to inflammatory cytokines or environmental stress. As an essential component of p38 MAP kinase mediated signal transduction pathway, this gene is involved in many cellular processes such as stress induced cell cycle arrest, transcription activation and apoptosis. | NA |
| chromosome 8 open reading frame 88 | 100127983 | C8orf88 | ENSG00000253250 | NA | NA |
| NA | ENSG00000249863 | RP11-177C12.1 | ENSG00000249863 | NA | NA |
| prostaglandin-endoperoxide synthase 1 | 5742 | PTGS1 | ENSG00000095303 | This is one of two genes encoding similar enzymes that catalyze the conversion of arachinodate to prostaglandin. The encoded protein regulates angiogenesis in endothelial cells, and is inhibited by nonsteroidal anti-inflammatory drugs such as aspirin. Based on its ability to function as both a cyclooxygenase and as a peroxidase, the encoded protein has been identified as a moonlighting protein. The protein may promote cell proliferation during tumor progression. Alternative splicing results in multiple transcript variants. | NA |
| NA | NA | NA | ENSG00000259716 | NA | TRUE |
| NA | ENSG00000268707 | RP11-247A12.7 | ENSG00000268707 | NA | NA |
| sperm-tail PG-rich repeat containing 3 | 441476 | STPG3 | ENSG00000197768 | NA | NA |
| glutamic pyruvate transaminase (alanine aminotransferase) 2 | 84706 | GPT2 | ENSG00000166123 | This gene encodes a mitochondrial alanine transaminase, a pyridoxal enzyme that catalyzes the reversible transamination between alanine and 2-oxoglutarate to generate pyruvate and glutamate. Alanine transaminases play roles in gluconeogenesis and amino acid metabolism in many tissues including skeletal muscle, kidney, and liver. Activating transcription factor 4 upregulates this gene under metabolic stress conditions in hepatocyte cell lines. A loss of function mutation in this gene has been associated with developmental encephalopathy. Alternative splicing results in multiple transcript variants. | NA |
| PRKAR2A antisense RNA 1 | 100506637 | PRKAR2A-AS1 | ENSG00000224424 | NA | NA |
| cadherin 2 | 1000 | CDH2 | ENSG00000170558 | This gene encodes a classical cadherin and member of the cadherin superfamily. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein is proteolytically processed to generate a calcium-dependent cell adhesion molecule and glycoprotein. This protein plays a role in the establishment of left-right asymmetry, development of the nervous system and the formation of cartilage and bone. | NA |
| cell cycle exit and neuronal differentiation 1 | 51286 | CEND1 | ENSG00000184524 | The protein encoded by this gene is a neuron-specific protein. The similar protein in pig enhances neuroblastoma cell differentiation in vitro and may be involved in neuronal differentiation in vivo. Multiple pseudogenes have been reported for this gene. | NA |
| sterile alpha motif domain containing 13 | 148418 | SAMD13 | ENSG00000203943 | NA | NA |
| vitronectin | 7448 | VTN | ENSG00000109072 | The protein encoded by this gene is a member of the pexin family. It is found in serum and tissues and promotes cell adhesion and spreading, inhibits the membrane-damaging effect of the terminal cytolytic complement pathway, and binds to several serpin serine protease inhibitors. It is a secreted protein and exists in either a single chain form or a clipped, two chain form held together by a disulfide bond. | NA |
| myosin light chain 9 | 10398 | MYL9 | ENSG00000101335 | Myosin, a structural component of muscle, consists of two heavy chains and four light chains. The protein encoded by this gene is a myosin light chain that may regulate muscle contraction by modulating the ATPase activity of myosin heads. The encoded protein binds calcium and is activated by myosin light chain kinase. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| SRRM2 antisense RNA 1 | 100128788 | SRRM2-AS1 | ENSG00000205913 | NA | NA |
| NA | ENSG00000272735 | RP11-467P9.1 | ENSG00000272735 | NA | NA |
| solute carrier family 16 member 9 | 220963 | SLC16A9 | ENSG00000165449 | NA | NA |
| phosphatidylinositol-4-phosphate 5-kinase type 1 gamma | 23396 | PIP5K1C | ENSG00000186111 | This locus encodes a type I phosphatidylinositol 4-phosphate 5-kinase. The encoded protein catalyzes phosphorylation of phosphatidylinositol 4-phosphate, producing phosphatidylinositol 4,5-bisphosphate. This enzyme is found at synapses and has been found to play roles in endocytosis and cell migration. Mutations at this locus have been associated with lethal congenital contractural syndrome. Alternatively spliced transcript variants encoding different isoforms have been described. | NA |
| NA | ENSG00000217648 | RP1-95L4.4 | ENSG00000217648 | NA | NA |
| ubiquitin specific peptidase 6 | 9098 | USP6 | ENSG00000129204 | NA | NA |
| ATPase phospholipid transporting 8A1 | 10396 | ATP8A1 | ENSG00000124406 | The P-type adenosinetriphosphatases (P-type ATPases) are a family of proteins which use the free energy of ATP hydrolysis to drive uphill transport of ions across membranes. Several subfamilies of P-type ATPases have been identified. One subfamily catalyzes transport of heavy metal ions. Another subfamily transports non-heavy metal ions (NMHI). The protein encoded by this gene is a member of the third subfamily of P-type ATPases and acts to transport amphipaths, such as phosphatidylserine. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| vasodilator-stimulated phosphoprotein | 7408 | VASP | ENSG00000125753 | Vasodilator-stimulated phosphoprotein (VASP) is a member of the Ena-VASP protein family. Ena-VASP family members contain an EHV1 N-terminal domain that binds proteins containing E/DFPPPPXD/E motifs and targets Ena-VASP proteins to focal adhesions. In the mid-region of the protein, family members have a proline-rich domain that binds SH3 and WW domain-containing proteins. Their C-terminal EVH2 domain mediates tetramerization and binds both G and F actin. VASP is associated with filamentous actin formation and likely plays a widespread role in cell adhesion and motility. VASP may also be involved in the intracellular signaling pathways that regulate integrin-extracellular matrix interactions. VASP is regulated by the cyclic nucleotide-dependent kinases PKA and PKG. | NA |
| paired related homeobox 1 | 5396 | PRRX1 | ENSG00000116132 | The DNA-associated protein encoded by this gene is a member of the paired family of homeobox proteins localized to the nucleus. The protein functions as a transcription co-activator, enhancing the DNA-binding activity of serum response factor, a protein required for the induction of genes by growth and differentiation factors. The protein regulates muscle creatine kinase, indicating a role in the establishment of diverse mesodermal muscle types. Alternative splicing yields two isoforms that differ in abundance and expression patterns. | NA |
| actin, alpha, cardiac muscle 1 | 70 | ACTC1 | ENSG00000159251 | Actins are highly conserved proteins that are involved in various types of cell motility. Polymerization of globular actin (G-actin) leads to a structural filament (F-actin) in the form of a two-stranded helix. Each actin can bind to four others. The protein encoded by this gene belongs to the actin family which is comprised of three main groups of actin isoforms, alpha, beta, and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. Defects in this gene have been associated with idiopathic dilated cardiomyopathy (IDC) and familial hypertrophic cardiomyopathy (FHC). | NA |
| PPARG coactivator 1 beta | 133522 | PPARGC1B | ENSG00000155846 | The protein encoded by this gene stimulates the activity of several transcription factors and nuclear receptors, including estrogen receptor alpha, nuclear respiratory factor 1, and glucocorticoid receptor. The encoded protein may be involved in fat oxidation, non-oxidative glucose metabolism, and the regulation of energy expenditure. This protein is downregulated in prediabetic and type 2 diabetes mellitus patients. Certain allelic variations in this gene increase the risk of the development of obesity. Three transcript variants encoding different isoforms have been found for this gene. | NA |
| protein kinase cAMP-dependent type I regulatory subunit beta | 5575 | PRKAR1B | ENSG00000188191 | The protein encoded by this gene is a regulatory subunit of cyclic AMP-dependent protein kinase A (PKA), which is involved in the signaling pathway of the second messenger cAMP. Two regulatory and two catalytic subunits form the PKA holoenzyme, disbands after cAMP binding. The holoenzyme is involved in many cellular events, including ion transport, metabolism, and transcription. Several transcript variants encoding the same protein have been found for this gene. | NA |
| serum response factor | 6722 | SRF | ENSG00000112658 | This gene encodes a ubiquitous nuclear protein that stimulates both cell proliferation and differentiation. It is a member of the MADS (MCM1, Agamous, Deficiens, and SRF) box superfamily of transcription factors. This protein binds to the serum response element (SRE) in the promoter region of target genes. This protein regulates the activity of many immediate-early genes, for example c-fos, and thereby participates in cell cycle regulation, apoptosis, cell growth, and cell differentiation. This gene is the downstream target of many pathways; for example, the mitogen-activated protein kinase pathway (MAPK) that acts through the ternary complex factors (TCFs). Two transcript variants encoding different isoforms have been found for this gene. | NA |
| butyrylcholinesterase | 590 | BCHE | ENSG00000114200 | Mutant alleles at the BCHE locus are responsible for suxamethonium sensitivity. Homozygous persons sustain prolonged apnea after administration of the muscle relaxant suxamethonium in connection with surgical anesthesia. The activity of pseudocholinesterase in the serum is low and its substrate behavior is atypical. In the absence of the relaxant, the homozygote is at no known disadvantage. | NA |
| NA | ENSG00000186076 | RP11-887P2.3 | ENSG00000186076 | NA | NA |
| long intergenic non-protein coding RNA 1135 | ENSG00000234807 | LINC01135 | ENSG00000234807 | NA | NA |
| epithelial membrane protein 3 | 2014 | EMP3 | ENSG00000142227 | The protein encoded by this gene belongs to the PMP-22/EMP/MP20 family of proteins. The protein contains four transmembrane domains and two N-linked glycosylation sites. It is thought to be involved in cell proliferation, cell-cell interactions and function as a tumor suppressor. Alternative splicing results in multiple transcript variants. | NA |
| adenylate kinase 4 | 205 | AK4 | ENSG00000162433 | This gene encodes a member of the adenylate kinase family of enzymes. The encoded protein is localized to the mitochondrial matrix. Adenylate kinases regulate the adenine and guanine nucleotide compositions within a cell by catalyzing the reversible transfer of phosphate group among these nucleotides. Five isozymes of adenylate kinase have been identified in vertebrates. Expression of these isozymes is tissue-specific and developmentally regulated. A pseudogene for this gene has been located on chromosome 17. Three transcript variants encoding the same protein have been identified for this gene. Sequence alignment suggests that the gene defined by NM_013410, NM_203464, and NM_001005353 is located on chromosome 1. | NA |
| family with sequence similarity 175 member A | 84142 | FAM175A | ENSG00000163322 | NA | NA |
| ankyrin repeat domain 23 | 200539 | ANKRD23 | ENSG00000163126 | This gene is a member of the muscle ankyrin repeat protein (MARP) family and encodes a protein with four tandem ankyrin-like repeats. The protein is localized to the nucleus, functioning as a transcriptional regulator. Expression of this protein is induced during recovery following starvation. | NA |
| uveal autoantigen with coiled-coil domains and ankyrin repeats | 55075 | UACA | ENSG00000137831 | NA | NA |
| phosphoglycerate mutase 1 | 5223 | PGAM1 | ENSG00000171314 | The protein encoded by this gene is a mutase that catalyzes the reversible reaction of 3-phosphoglycerate (3-PGA) to 2-phosphoglycerate (2-PGA) in the glycolytic pathway. Two transcript variants encoding different isoforms have been found for this gene. | NA |
| TIPARP antisense RNA 1 | ENSG00000243926 | TIPARP-AS1 | ENSG00000243926 | NA | NA |
| transcription factor CP2-like 1 | 29842 | TFCP2L1 | ENSG00000115112 | NA | NA |
| cilia and flagella associated protein 53 | 220136 | CFAP53 | ENSG00000172361 | This gene belongs to the CFAP53 family. It was found to be differentially expressed by the ciliated cells of frog epidermis and in skin fibroblasts from human. Mutations in this gene are associated with visceral heterotaxy-6, which implicates this gene in determination of left-right asymmetric patterning. | NA |
| AF4/FMR2 family member 3 | 3899 | AFF3 | ENSG00000144218 | This gene encodes a tissue-restricted nuclear transcriptional activator that is preferentially expressed in lymphoid tissue. Isolation of this protein initially defined a highly conserved LAF4/MLLT2 gene family of nuclear transcription factors that may function in lymphoid development and oncogenesis. In some ALL patients, this gene has been found fused to the gene for MLL. Multiple alternatively spliced transcript variants that encode different proteins have been found for this gene. | NA |
| NA | ENSG00000245864 | CTC-467M3.1 | ENSG00000245864 | NA | NA |
| myosin light chain 7 | 58498 | MYL7 | ENSG00000106631 | NA | NA |
| transmembrane protein 52 | 339456 | TMEM52 | ENSG00000178821 | NA | NA |
| sperm acrosome associated 6 | 147650 | SPACA6 | ENSG00000182310 | NA | NA |
| transmembrane protein 240 | 339453 | TMEM240 | ENSG00000205090 | This gene encodes a transmembrane-domain containing protein found in the brain and cerebellum. Mutations in this gene result in spinocerebellar ataxia 21. | NA |
| tetratricopeptide repeat and ankyrin repeat containing 1 | 9881 | TRANK1 | ENSG00000168016 | NA | NA |
| Purkinje cell protein 4 like 1 | 654790 | PCP4L1 | ENSG00000248485 | NA | NA |
| NA | ENSG00000265168 | RP11-192H23.5 | ENSG00000265168 | NA | NA |
| regulator of G-protein signaling 2 | 5997 | RGS2 | ENSG00000116741 | Regulator of G protein signaling (RGS) family members are regulatory molecules that act as GTPase activating proteins (GAPs) for G alpha subunits of heterotrimeric G proteins. RGS proteins are able to deactivate G protein subunits of the Gi alpha, Go alpha and Gq alpha subtypes. They drive G proteins into their inactive GDP-bound forms. Regulator of G protein signaling 2 belongs to this family. The protein acts as a mediator of myeloid differentiation and may play a role in leukemogenesis. | NA |
| NA | ENSG00000260572 | RP11-16N11.2 | ENSG00000260572 | NA | NA |
write.table(as.factor(out$query),
            paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 19, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE);
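The query-then-write step is repeated by hand for each cluster, so it may be convenient to wrap the write in a small helper. The sketch below is an illustration only and not part of the original analysis: the function name `write_cluster_genes`, its arguments `out_dir` and `drop_notfound`, and the toy data frame are all assumptions. It relies on the result having a `query` column and, when some IDs are unmatched, a `notfound` column that is `TRUE` for those rows and `NA` otherwise, as in the tables shown here. It writes to a temporary directory so that it runs without network access or the project layout.

```r
# Hypothetical helper (not from the original analysis): write the query IDs of
# one cluster to "<out_dir>/gene_names_clus_<k>.txt", one ID per line.
write_cluster_genes <- function(out, k, out_dir, drop_notfound = FALSE) {
  df <- as.data.frame(out)
  if (drop_notfound && "notfound" %in% names(df)) {
    # notfound is TRUE for unmatched IDs and NA for matched ones
    df <- df[is.na(df$notfound) | !df$notfound, , drop = FALSE]
  }
  path <- file.path(out_dir, paste0("gene_names_clus_", k, ".txt"))
  write.table(df$query, path, col.names = FALSE, row.names = FALSE, quote = FALSE)
  invisible(path)
}

# Toy stand-in for a queryMany() result, built from three rows of the table above.
toy_out <- data.frame(
  query = c("ENSG00000170421", "ENSG00000259716", "ENSG00000196218"),
  symbol = c("KRT8", NA, "RYR1"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)
toy_path <- write_cluster_genes(toy_out, 19, tempdir(), drop_notfound = TRUE)
toy_ids <- readLines(toy_path)
```

With the default `drop_notfound = FALSE` the helper keeps every queried ID, which matches what the `write.table` call above does. In the real analysis it could presumably be called inside a loop over the rows of `gene_list`, passing `"../utilities/GTEX2013_sparse_fac_voom"` as `out_dir`.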
out <- mygene::queryMany(gene_list[20,], scopes="ensembl.gene", fields=c("name", "summary", "symbol"), species="human");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
| symbol | X_id | query | name | summary | notfound |
|---|---|---|---|---|---|
| GTF2IP13 | ENSG00000272556 | ENSG00000272556 | general transcription factor IIi pseudogene 13 | NA | NA |
| FOSL1 | 8061 | ENSG00000175592 | FOS like 1, AP-1 transcription factor subunit | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| CCER2 | 643669 | ENSG00000262484 | coiled-coil glutamate rich protein 2 | NA | NA |
| CEBPA | 1050 | ENSG00000245848 | CCAAT/enhancer binding protein alpha | This intronless gene encodes a transcription factor that contains a basic leucine zipper (bZIP) domain and recognizes the CCAAT motif in the promoters of target genes. The encoded protein functions in homodimers and also heterodimers with CCAAT/enhancer-binding proteins beta and gamma. Activity of this protein can modulate the expression of genes involved in cell cycle regulation as well as in body weight homeostasis. Mutation of this gene is associated with acute myeloid leukemia. The use of alternative in-frame non-AUG (GUG) and AUG start codons results in protein isoforms with different lengths. Differential translation initiation is mediated by an out-of-frame, upstream open reading frame which is located between the GUG and the first AUG start codons. | NA |
| CTD-3025N20.3 | ENSG00000272010 | ENSG00000272010 | NA | NA | NA |
| AC017101.10 | ENSG00000227227 | ENSG00000227227 | NA | NA | NA |
| IPO7P2 | ENSG00000225674 | ENSG00000225674 | importin 7 pseudogene 2 | NA | NA |
| GLI1 | 2735 | ENSG00000111087 | GLI family zinc finger 1 | This gene encodes a member of the Kruppel family of zinc finger proteins. The encoded transcription factor is activated by the sonic hedgehog signal transduction cascade and regulates stem cell proliferation. The activity and nuclear localization of this protein is negatively regulated by p53 in an inhibitory loop. Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| NAMPTP1 | ENSG00000229644 | ENSG00000229644 | nicotinamide phosphoribosyltransferase pseudogene 1 | NA | NA |
| GPRC5A | 9052 | ENSG00000013588 | G protein-coupled receptor class C group 5 member A | This gene encodes a member of the type 3 G protein-coupling receptor family, characterized by the signature 7-transmembrane domain motif. The encoded protein may be involved in interaction between retinoid acid and G protein signalling pathways. Retinoic acid plays a critical role in development, cellular growth, and differentiation. This gene may play a role in embryonic development and epithelial cell differentiation. | NA |
| NR4A3 | 8013 | ENSG00000119508 | nuclear receptor subfamily 4 group A member 3 | This gene encodes a member of the steroid-thyroid hormone-retinoid receptor superfamily. The encoded protein may act as a transcriptional activator. The protein can efficiently bind the NGFI-B Response Element (NBRE). Three different versions of extraskeletal myxoid chondrosarcomas (EMCs) are the result of reciprocal translocations between this gene and other genes. The translocation breakpoints are associated with Nuclear Receptor Subfamily 4, Group A, Member 3 (on chromosome 9) and either Ewing Sarcome Breakpoint Region 1 (on chromosome 22), RNA Polymerase II, TATA Box-Binding Protein-Associated Factor, 68-KD (on chromosome 17), or Transcription factor 12 (on chromosome 15). Multiple transcript variants encoding different isoforms have been found for this gene. | NA |
| CCDC150 | 284992 | ENSG00000144395 | coiled-coil domain containing 150 | NA | NA |
| CIDEC | 63924 | ENSG00000187288 | cell death inducing DFFA like effector c | This gene encodes a member of the cell death-inducing DNA fragmentation factor-like effector family. Members of this family play important roles in apoptosis. The encoded protein promotes lipid droplet formation in adipocytes and may mediate adipocyte apoptosis. This gene is regulated by insulin and its expression is positively correlated with insulin sensitivity. Mutations in this gene may contribute to insulin resistant diabetes. A pseudogene of this gene is located on the short arm of chromosome 3. Alternatively spliced transcript variants that encode different isoforms have been observed for this gene. | NA |
| FXYD1 | 5348 | ENSG00000266964 | FXYD domain containing ion transport regulator 1 | This gene encodes a member of a family of small membrane proteins that share a 35-amino acid signature sequence domain, beginning with the sequence PFXYD and containing 7 invariant and 6 highly conserved amino acids. The approved human gene nomenclature for the family is FXYD-domain containing ion transport regulator. Mouse FXYD5 has been termed RIC (Related to Ion Channel). FXYD2, also known as the gamma subunit of the Na,K-ATPase, regulates the properties of that enzyme. FXYD1 (phospholemman), FXYD2 (gamma), FXYD3 (MAT-8), FXYD4 (CHIF), and FXYD5 (RIC) have been shown to induce channel activity in experimental expression systems. Transmembrane topology has been established for two family members (FXYD1 and FXYD2), with the N-terminus extracellular and the C-terminus on the cytoplasmic side of the membrane. The protein encoded by this gene is a plasma membrane substrate for several kinases, including protein kinase A, protein kinase C, NIMA kinase, and myotonic dystrophy kinase. It is thought to form an ion channel or regulate ion channel activity. Transcript variants with different 5’ UTR sequences have been described in the literature. | NA |
| CTD-2527I21.4 | ENSG00000221857 | ENSG00000221857 | NA | NA | NA |
| CYP1A1 | 1543 | ENSG00000140465 | cytochrome P450 family 1 subfamily A member 1 | This gene, CYP1A1, encodes a member of the cytochrome P450 superfamily of enzymes. The cytochrome P450 proteins are monooxygenases which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. This protein localizes to the endoplasmic reticulum and its expression is induced by some polycyclic aromatic hydrocarbons (PAHs), some of which are found in cigarette smoke. The enzyme’s endogenous substrate is unknown; however, it is able to metabolize some PAHs to carcinogenic intermediates. The gene has been associated with lung cancer risk. A related family member, CYP1A2, is located approximately 25 kb away from CYP1A1 on chromosome 15. Alternative splicing results in multiple transcript variants encoding distinct isoforms. | NA |
| SRXN1 | 140809 | ENSG00000271303 | sulfiredoxin 1 | NA | NA |
| ACTG1P17 | 283693 | ENSG00000259315 | actin gamma 1 pseudogene 17 | NA | NA |
| KRTAP5-9 | 3846 | ENSG00000254997 | keratin associated protein 5-9 | NA | NA |
| CTD-2517M22.14 | ENSG00000255182 | ENSG00000255182 | NA | NA | NA |
| ZNF770 | 54989 | ENSG00000198146 | zinc finger protein 770 | NA | NA |
| RP11-618G20.1 | ENSG00000258964 | ENSG00000258964 | NA | NA | NA |
| VLDLR-AS1 | 401491 | ENSG00000236404 | VLDLR antisense RNA 1 | NA | NA |
| GPT | 2875 | ENSG00000167701 | glutamic-pyruvate transaminase (alanine aminotransferase) | This gene encodes cytosolic alanine aminotransaminase 1 (ALT1); also known as glutamate-pyruvate transaminase 1. This enzyme catalyzes the reversible transamination between alanine and 2-oxoglutarate to generate pyruvate and glutamate and, therefore, plays a key role in the intermediary metabolism of glucose and amino acids. Serum activity levels of this enzyme are routinely used as a biomarker of liver injury caused by drug toxicity, infection, alcohol, and steatosis. A related gene on chromosome 16 encodes a putative mitochondrial alanine aminotransaminase. | NA |
| UBE2V1P2 | ENSG00000214192 | ENSG00000214192 | ubiquitin conjugating enzyme E2 variant 1 pseudogene 2 | NA | NA |
| RP11-134K13.4 | ENSG00000271967 | ENSG00000271967 | NA | NA | NA |
| NPM1P39 | ENSG00000225159 | ENSG00000225159 | nucleophosmin 1 (nucleolar phosphoprotein B23, numatrin) pseudogene 39 | NA | NA |
| RP11-1096G20.5 | ENSG00000266368 | ENSG00000266368 | NA | NA | NA |
| RARRES1 | 5918 | ENSG00000118849 | retinoic acid receptor responder 1 | This gene was identified as a retinoid acid (RA) receptor-responsive gene. It encodes a type 1 membrane protein. The expression of this gene is upregulated by tazarotene as well as by retinoic acid receptors. The expression of this gene is found to be downregulated in prostate cancer, which is caused by the methylation of its promoter and CpG island. Alternatively spliced transcript variant encoding distinct isoforms have been observed. | NA |
| RP11-130L8.2 | ENSG00000269976 | ENSG00000269976 | NA | NA | NA |
| IFFO2 | 126917 | ENSG00000169991 | intermediate filament family orphan 2 | NA | NA |
| MTPN | 136319 | ENSG00000105887 | myotrophin | The transcript produced from this gene is bi-cistronic and can encode both myotrophin and leucine zipper protein 6. The myotrophin protein is associated with cardiac hypertrophy, where it is involved in the conversion of NFkappa B p50-p65 heterodimers to p50-p50 and p65-p65 homodimers. This protein also has a potential function in cerebellar morphogenesis, and it may be involved in the differentiation of cerebellar neurons, particularly of granule cells. A cryptic ORF at the 3’ end of this transcript uses a novel internal ribosome entry site and a non-AUG translation initiation codon to produce leucine zipper protein 6, a 6.4 kDa tumor antigen that is associated with myeloproliferative disease. | NA |
| RP11-299M14.2 | ENSG00000255343 | ENSG00000255343 | NA | NA | NA |
| NA | NA | ENSG00000273075 | NA | NA | TRUE |
| SLC35E1 | 79939 | ENSG00000127526 | solute carrier family 35 member E1 | NA | NA |
| FAM229A | 100128071 | ENSG00000225828 | family with sequence similarity 229 member A | NA | NA |
| SMARCA5-AS1 | ENSG00000245112 | ENSG00000245112 | SMARCA5 antisense RNA 1 | NA | NA |
| HIF1A | 3091 | ENSG00000100644 | hypoxia inducible factor 1 alpha subunit | This gene encodes the alpha subunit of transcription factor hypoxia-inducible factor-1 (HIF-1), which is a heterodimer composed of an alpha and a beta subunit. HIF-1 functions as a master regulator of cellular and systemic homeostatic response to hypoxia by activating transcription of many genes, including those involved in energy metabolism, angiogenesis, apoptosis, and other genes whose protein products increase oxygen delivery or facilitate metabolic adaptation to hypoxia. HIF-1 thus plays an essential role in embryonic vascularization, tumor angiogenesis and pathophysiology of ischemic disease. Alternatively spliced transcript variants encoding different isoforms have been identified for this gene. | NA |
| CAHM | 100526820 | ENSG00000270419 | colon adenocarcinoma hypermethylated (non-protein coding) | NA | NA |
| GSDMB | 55876 | ENSG00000073605 | gasdermin B | This gene encodes a member of the gasdermin-domain containing protein family. Other gasdermin-family genes are implicated in the regulation of apoptosis in epithelial cells, and are linked to cancer. Multiple transcript variants encoding different isoforms have been found for this gene. Additional variants have been described, but they are candidates for nonsense-mediated mRNA decay (NMD) and are unlikely to be protein-coding. | NA |
| TBX15 | 6913 | ENSG00000092607 | T-box 15 | This gene belongs to the T-box family of genes, which encode a phylogenetically conserved family of transcription factors that regulate a variety of developmental processes. All these genes contain a common T-box DNA-binding domain. Mutations in this gene are associated with Cousin syndrome. | NA |
| ARG2 | 384 | ENSG00000081181 | arginase 2 | Arginase catalyzes the hydrolysis of arginine to ornithine and urea. At least two isoforms of mammalian arginase exists (types I and II) which differ in their tissue distribution, subcellular localization, immunologic crossreactivity and physiologic function. The type II isoform encoded by this gene, is located in the mitochondria and expressed in extra-hepatic tissues, especially kidney. The physiologic role of this isoform is poorly understood; it is thought to play a role in nitric oxide and polyamine metabolism. Transcript variants of the type II gene resulting from the use of alternative polyadenylation sites have been described. | NA |
| NPM1P6 | ENSG00000213881 | ENSG00000213881 | nucleophosmin 1 (nucleolar phosphoprotein B23, numatrin) pseudogene 6 | NA | NA |
| MIR3661 | 100500905 | ENSG00000266751 | microRNA 3661 | microRNAs (miRNAs) are short (20-24 nt) non-coding RNAs that are involved in post-transcriptional regulation of gene expression in multicellular organisms by affecting both the stability and translation of mRNAs. miRNAs are transcribed by RNA polymerase II as part of capped and polyadenylated primary transcripts (pri-miRNAs) that can be either protein-coding or non-coding. The primary transcript is cleaved by the Drosha ribonuclease III enzyme to produce an approximately 70-nt stem-loop precursor miRNA (pre-miRNA), which is further cleaved by the cytoplasmic Dicer ribonuclease to generate the mature miRNA and antisense miRNA star (miRNA*) products. The mature miRNA is incorporated into a RNA-induced silencing complex (RISC), which recognizes target mRNAs through imperfect base pairing with the miRNA and most commonly results in translational inhibition or destabilization of the target mRNA. The RefSeq represents the predicted microRNA stem-loop. | NA |
| RP1-117B12.4 | ENSG00000253102 | ENSG00000253102 | NA | NA | NA |
| AZGP1 | 563 | ENSG00000160862 | alpha-2-glycoprotein 1, zinc-binding | NA | NA |
| RP11-46F15.2 | ENSG00000238260 | ENSG00000238260 | NA | NA | NA |
| RP4-791M13.3 | ENSG00000254539 | ENSG00000254539 | NA | NA | NA |
| NAMPT | 10135 | ENSG00000105835 | nicotinamide phosphoribosyltransferase | This gene encodes a protein that catalyzes the condensation of nicotinamide with 5-phosphoribosyl-1-pyrophosphate to yield nicotinamide mononucleotide, one step in the biosynthesis of nicotinamide adenine dinucleotide. The protein belongs to the nicotinic acid phosphoribosyltransferase (NAPRTase) family and is thought to be involved in many important biological processes, including metabolism, stress response and aging. This gene has a pseudogene on chromosome 10. | NA |
| DDX21 | 9188 | ENSG00000165732 | DEAD-box helicase 21 | DEAD box proteins, characterized by the conserved motif Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are implicated in a number of cellular processes involving alteration of RNA secondary structure such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly. Based on their distribution patterns, some members of this family are believed to be involved in embryogenesis, spermatogenesis, and cellular growth and division. This gene encodes a DEAD box protein, which is an antigen recognized by autoimmune antibodies from a patient with watermelon stomach disease. This protein unwinds double-stranded RNA, folds single-stranded RNA, and may play important roles in ribosomal RNA biogenesis, RNA editing, RNA transport, and general transcription. | NA |
| RP11-457M11.5 | ENSG00000261584 | ENSG00000261584 | NA | NA | NA |
| AC025442.3 | ENSG00000253744 | ENSG00000253744 | NA | NA | NA |
| KIAA1683 | 80726 | ENSG00000130518 | KIAA1683 | NA | NA |
| RP11-458D21.1 | ENSG00000233396 | ENSG00000233396 | NA | NA | NA |
| LRRC59 | 55379 | ENSG00000108829 | leucine rich repeat containing 59 | NA | NA |
| TWF1P1 | ENSG00000178082 | ENSG00000178082 | twinfilin 1 pseudogene 1 | NA | NA |
| ZNF426 | 79088 | ENSG00000130818 | zinc finger protein 426 | Kaposi’s sarcoma-associated herpesvirus (KSHV) can be reactivated from latency by the viral protein RTA. The protein encoded by this gene is a zinc finger transcriptional repressor that interacts with RTA to modulate RTA-mediated reactivation of KSHV. While the encoded protein can repress KSHV reactivation, RTA can induce degradation of this protein through the ubiquitin-proteasome pathway to overcome the repression. Several transcript variants encoding different isoforms have been found for this gene. | NA |
| NRBP2 | 340371 | ENSG00000185189 | nuclear receptor binding protein 2 | NA | NA |
| RP11-127B20.3 | ENSG00000272677 | ENSG00000272677 | NA | NA | NA |
| RP11-299G20.2 | ENSG00000259172 | ENSG00000259172 | NA | NA | NA |
| YWHAZP3 | ENSG00000229932 | ENSG00000229932 | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta pseudogene 3 | NA | NA |
| CTD-2373J6.1 | ENSG00000260871 | ENSG00000260871 | NA | NA | NA |
| NA | NA | ENSG00000269942 | NA | NA | TRUE |
| RP11-16E12.2 | ENSG00000259772 | ENSG00000259772 | NA | NA | NA |
| SNHG11 | 128439 | ENSG00000174365 | small nucleolar RNA host gene 11 | This gene is a member of the non-protein-coding multiple snoRNA host gene family. Two snoRNAs are derived from the introns of this host gene. Although many alternative splice variants have been observed, the gene is thought to have no protein-coding potential. | NA |
| HESX1 | 8820 | ENSG00000163666 | HESX homeobox 1 | This gene encodes a conserved homeobox protein that is a transcriptional repressor in the developing forebrain and pituitary gland. Mutations in this gene are associated with septooptic dysplasia, HESX1-related growth hormone deficiency, and combined pituitary hormone deficiency. | NA |
| RP11-561C5.4 | ENSG00000229212 | ENSG00000229212 | NA | NA | NA |
| CTC-336P14.1 | ENSG00000271228 | ENSG00000271228 | NA | NA | NA |
| AC016722.4 | ENSG00000228925 | ENSG00000228925 | NA | NA | NA |
| RP11-1277A3.3 | ENSG00000272459 | ENSG00000272459 | NA | NA | NA |
| HSPH1 | 10808 | ENSG00000120694 | heat shock protein family H (Hsp110) member 1 | NA | NA |
| MAPK13 | 5603 | ENSG00000156711 | mitogen-activated protein kinase 13 | This gene encodes a member of the mitogen-activated protein (MAP) kinase family. MAP kinases act as an integration point for multiple biochemical signals, and are involved in a wide variety of cellular processes such as proliferation, differentiation, transcription regulation and development. The encoded protein is a p38 MAP kinase and is activated by proinflammatory cytokines and cellular stress. Substrates of the encoded protein include the transcription factor ATF2 and the microtubule dynamics regulator stathmin. Alternatively spliced transcript variants have been observed for this gene. | NA |
| AKR7L | ENSG00000211454 | ENSG00000211454 | aldo-keto reductase family 7-like (gene/pseudogene) | NA | NA |
| RP5-867C24.5 | ENSG00000261872 | ENSG00000261872 | NA | NA | NA |
| NA | NA | ENSG00000267167 | NA | NA | TRUE |
| TDGP1 | ENSG00000255725 | ENSG00000255725 | thymine-DNA glycosylase pseudogene 1 | NA | NA |
| MIR3652 | 100500842 | ENSG00000265072 | microRNA 3652 | microRNAs (miRNAs) are short (20-24 nt) non-coding RNAs that are involved in post-transcriptional regulation of gene expression in multicellular organisms by affecting both the stability and translation of mRNAs. miRNAs are transcribed by RNA polymerase II as part of capped and polyadenylated primary transcripts (pri-miRNAs) that can be either protein-coding or non-coding. The primary transcript is cleaved by the Drosha ribonuclease III enzyme to produce an approximately 70-nt stem-loop precursor miRNA (pre-miRNA), which is further cleaved by the cytoplasmic Dicer ribonuclease to generate the mature miRNA and antisense miRNA star (miRNA*) products. The mature miRNA is incorporated into a RNA-induced silencing complex (RISC), which recognizes target mRNAs through imperfect base pairing with the miRNA and most commonly results in translational inhibition or destabilization of the target mRNA. The RefSeq represents the predicted microRNA stem-loop. | NA |
| FOSL2 | 2355 | ENSG00000075426 | FOS like 2, AP-1 transcription factor subunit | The Fos gene family consists of 4 members: FOS, FOSB, FOSL1, and FOSL2. These genes encode leucine zipper proteins that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. As such, the FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. | NA |
| GNRH1 | 2796 | ENSG00000147437 | gonadotropin releasing hormone 1 | This gene encodes a preproprotein that is proteolytically processed to generate a peptide that is a member of the gonadotropin-releasing hormone (GnRH) family of peptides. Alternative splicing results in multiple transcript variants, at least one of which is secreted and then cleaved to generate gonadoliberin-1 and GnRH-associated peptide 1. Gonadoliberin-1 stimulates the release of luteinizing and follicle stimulating hormones, which are important for reproduction. Mutations in this gene are associated with hypogonadotropic hypogonadism. | NA |
| PRSS1 | 5644 | ENSG00000204983 | protease, serine 1 | This gene encodes a trypsinogen, which is a member of the trypsin family of serine proteases. This enzyme is secreted by the pancreas and cleaved to its active form in the small intestine. It is active on peptide linkages involving the carboxyl group of lysine or arginine. Mutations in this gene are associated with hereditary pancreatitis. This gene and several other trypsinogen genes are localized to the T cell receptor beta locus on chromosome 7. | NA |
| RDH5 | 5959 | ENSG00000135437 | retinol dehydrogenase 5 | This gene encodes an enzyme belonging to the short-chain dehydrogenases/reductases (SDR) family. This retinol dehydrogenase functions to catalyze the final step in the biosynthesis of 11-cis retinaldehyde, which is the universal chromophore of visual pigments. Mutations in this gene cause autosomal recessive fundus albipunctatus, a rare form of night blindness that is characterized by a delay in the regeneration of cone and rod photopigments. Alternative splicing results in multiple transcript variants. Read-through transcription also exists between this gene and the neighboring upstream BLOC1S1 (biogenesis of lysosomal organelles complex-1, subunit 1) gene. | NA |
| CTD-2035E11.5 | ENSG00000272144 | ENSG00000272144 | NA | NA | NA |
| NA | NA | ENSG00000272365 | NA | NA | TRUE |
| TMEM133 | 83935 | ENSG00000170647 | transmembrane protein 133 | There is evidence that this intronless gene is transcribed but the protein is predicted. The gene function is unknown. | NA |
| AC005540.3 | ENSG00000235852 | ENSG00000235852 | NA | NA | NA |
| NA | NA | ENSG00000261252 | NA | NA | TRUE |
| PDE6C | 5146 | ENSG00000095464 | phosphodiesterase 6C | This gene encodes the alpha-prime subunit of cone phosphodiesterase, which is composed of a homodimer of two alpha-prime subunits and 3 smaller proteins of 11, 13, and 15 kDa. Mutations in this gene are associated with cone dystrophy type 4 (COD4). | NA |
| LINC01089 | 338799 | ENSG00000212694 | long intergenic non-protein coding RNA 1089 | NA | NA |
| DPF3 | 8110 | ENSG00000205683 | double PHD fingers 3 | This gene encodes a member of the D4 protein family. The encoded protein is a transcription regulator that binds acetylated histones and is a component of the BAF chromatin remodeling complex. Alternate splicing results in multiple transcript variants encoding different isoforms. | NA |
| NR1H3 | 10062 | ENSG00000025434 | nuclear receptor subfamily 1 group H member 3 | The protein encoded by this gene belongs to the NR1 subfamily of the nuclear receptor superfamily. The NR1 family members are key regulators of macrophage function, controlling transcriptional programs involved in lipid homeostasis and inflammation. This protein is highly expressed in visceral organs, including liver, kidney and intestine. It forms a heterodimer with retinoid X receptor (RXR), and regulates expression of target genes containing retinoid response elements. Studies in mice lacking this gene suggest that it may play an important role in the regulation of cholesterol homeostasis. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. | NA |
| RAB37 | 326624 | ENSG00000172794 | RAB37, member RAS oncogene family | Rab proteins are low molecular mass GTPases that are critical regulators of vesicle trafficking. For additional background information on Rab proteins, see MIM 179508. | NA |
| GPR84 | 53831 | ENSG00000139572 | G protein-coupled receptor 84 | NA | NA |
| RP11-333E13.2 | ENSG00000250568 | ENSG00000250568 | NA | NA | NA |
| RP11-862L9.3 | ENSG00000266844 | ENSG00000266844 | NA | NA | NA |
| ZSWIM4 | 65249 | ENSG00000132003 | zinc finger SWIM-type containing 4 | NA | NA |
| C2orf82 | 389084 | ENSG00000182600 | chromosome 2 open reading frame 82 | NA | NA |
| HSP90AA2P | ENSG00000224411 | ENSG00000224411 | heat shock protein 90kDa alpha family class A member 2, pseudogene | NA | NA |
| AC079305.10 | ENSG00000222043 | ENSG00000222043 | NA | NA | NA |
| LOC171391 | 171391 | ENSG00000255284 | uncharacterized LOC171391 | NA | NA |
| RP11-408O19.5 | ENSG00000271631 | ENSG00000271631 | NA | NA | NA |
# Save the queried Ensembl IDs to the clus_20 file, one ID per line.
write.table(out$query, paste0("../utilities/GTEX2013_sparse_fac_voom/gene_names_clus_", 20, ".txt"),
            col.names = FALSE, row.names = FALSE, quote = FALSE)
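Several rows in the table above carry `TRUE` in the last column, which presumably is the `notfound` flag that `queryMany` sets for Ensembl IDs it could not match. The sketch below shows one way such rows could be dropped before the IDs are written out. It is only an illustration under stated assumptions: `out_toy` is a hypothetical stand-in built from three rows of the table, the column name `notfound` is assumed, and the output goes to a temporary file rather than the path used above.

```r
# Hypothetical stand-in for the queryMany() result, using three rows from the table above.
# Assumption: the last table column is named `notfound` (NA for hits, TRUE for misses).
out_toy <- data.frame(
  symbol   = c("HIF1A", NA, "ARG2"),
  query    = c("ENSG00000100644", "ENSG00000273075", "ENSG00000081181"),
  notfound = c(NA, TRUE, NA),
  stringsAsFactors = FALSE
)

# Keep only the rows whose query was matched (notfound is NA).
matched <- out_toy[is.na(out_toy$notfound), ]

# Write the matched Ensembl IDs, one per line, to a temporary file
# (hypothetical location, not the path used in the analysis).
out_file <- tempfile(fileext = ".txt")
write.table(matched$query, out_file,
            col.names = FALSE, row.names = FALSE, quote = FALSE)

# Read the file back to check what was written.
written_ids <- readLines(out_file)
print(written_ids)
```

Whether unmatched IDs should be excluded depends on how the downstream files are used; the original call writes every queried ID regardless of match status.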